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Abstract 

The  condition  number  of  a  problem  measures  the  sensitivity  of  the  answer  to  small  changes 
in  the  input.  We  call  the  problem  ill-posed  if  its  condition  number  is  infinite.  It  turns  out 
that  for  many  problems  of  numerical  analysis,  there  is  a  simple  relationship  between  the 
condition  number  of  a  problem  and  the  shortest  distance  from  that  problem  to  an  ill-posed 
one:  the  shortest  distance  is  proportional  to  the  reciprocal  of  the  condition  number  (or 
bounded  by  the  reciprocal  of  the  condition  number).  This  is  true  for  matrix  inversion, 
computing  eigenvalues  and  eigenvectors,  finding  zeros  of  polynomials,  and  pole  assignment 
in  linear  control  systems.  In  this  paper  we  explain  this  phenomenon  by  showing  that  in  all 
these  cases,  the  condition  number  k  satisfies  one  or  both  of  the  differential  inequalities 
m-K2  s  ||Dk||  s  A/k2,  where  ||Dk||  is  the  norm  of  the  gradient  of  k.  The  lower  bound  on 
||Dk||  leads  to  an  upper  bound  l/(mK(x))  on  the  distance  from  x  to  the  nearest  ill-posed 
problem,  and  the  upper  bound  on  ||Dk||  leads  to  a  lower  bound  V(Mk(x))  on  the  distance. 
The  attraction  of  this  approach  is  that  it  uses  local  information  (the  gradient  of  a  condition 
number)  to  answer  a  global  question:  how  far  away  is  the  nearest  ill-posed  problem?  The 
above  differential  inequalities  also  have  a  simple  interpretation:  they  imply  that  computing 
the  condition  number  of  a  problem  is  approximately  as  hard  as  computing  the  solution  of  the 
problem  itself.  In  addition  to  deriving  many  of  the  best  known  bounds  for  matrix  inversion, 
eigendecompositions  and  polynomial  zero  finding,  we  derive  new  bounds  on  the  distance  to 
the  nearest  polynomial  with  multiple  zeros  and  a  new  perturbation  result  on  pole  assignment. 

Classifications:  AMS(MOS):  15A12,  15A60,  65F35;  CR:  F.2.1,  G.1.0 


1.  Introduction 

The  condition  number  of  a  problem  measures  the  sensitivity  of  the  answer  to  small 
changes  in  the  input.  We  call  the  problem  ill-posed  if  its  condition  number  is  infinite.  The 
ill-posed  problems  typically  form  a  lower  dimensional  surface  in  the  space  of  problems.  It 
turns  out  that  for  many  problems  of  numerical  analysis,  there  is  a  simple  relationship 
between  the  condition  number  of  a  problem  and  the  shortest  distance  from  that  problem  to 
the  surface  of  ill-posed  ones:  the  shortest  distance  is  proportional  to  the  reciprocal  of  the 
condition  number,  or  bounded  by  the  reciprocal  of  the  condition  number.  Sometimes,  the 
distance  is  bounded  below  by  the  reciprocal  of  the  condition  number  squared.  This  is  true 
for  matrix  inversion,  computing  eigenvalues  and  eigenvectors,  finding  zeros  of  polynomials, 
and  pole  assignment  in  linear  control  systems. 

For  example,  in  the  case  of  matrix  inversion,  if  A  is  perturbed  to  A  +  8A,  then  to  first 
order  the  solution  A  ~ '  becomes  A  ~  l+  X  where 


JjxjL 


llA-'lHloAH 


l^ll 
(llll  is  any  operator  norm).  Thus  the  condition  number  of  this  problem  may  be  taken  as 

||A_,||.  It  is  well  known  [8]  that  the  shortest  distance  from  A  to  the  surface  of  singular  (ill- 
posed)  matrices  is  1/||A_1||.  Similar  results  for  computing  eigendecompositions  and  zeros  of 
polynomials  are  due  to  Wilkinson  [16]  and  Hough  [7]  and  will  be  discussed  further  below. 

In  this  paper  we  explain  this  phenomenon  and  unify  the  techniques  used  to  obtain  these 
results  by  showing  that  in  all  these  cases,  the  condition  number  k  satisfies  one  or  both  of  the 
following  differential  inequalities 

m-K2  £  ||Dk||s  Mk2  (1.1) 

almost  everywhere,  where  0<m£Af  and  \\Dk\\  is  the  norm  of  the  gradient  of  k.    From  the 

lower  bound  on  ||Dk||  we  will  deduce  that  there  is  a  curve  x(s)  (the  "steepest  ascent"  curve  of 

k),  parameterized  by  arclength  s  and  with  x(0)  =  x,  such  that 


i«(i(;))srK2(i(»))    • 
as 

This  last  inequality  can  be  integrated  explicitly  (see  Lemma  1  below),  yielding  an  upper 
bound  on  the  distance  dist(x.P)  (which  depends  on  ||||)  from  x  to  the  set  P  of  ill-posed 
problems 

dist(x,/>)  £  l—    . 

mic(x) 

From  the  upper  bound  on  ||£>k||  we  will  deduce  that  if  x(s)  is  any  smooth  curve 
parameterized  by  arclength  with  x(0)  =  x,  then 

■j-  k(x(*))  s  Mk2(*(*))    • 

as 

This  inequality  can  also  be  integrated  explicitly  (Lemma  2),  yielding  a  lower  bound 

— -1— ■  £  dist(x,J>) 

m-k(x) 

on  the  distance  to  the  nearest  problem  with  an  infinite  condition  number. 

For  some  problems  we  can  prove  a  differential  inequality  of  the  form  ||Dk||  s  A/k3 
which  yields  a  lower  bound  on  the  distance  1  /  (2A/k2(x))  s  dist(x,P). 

The  attraction  of  this  approach  is  that  it  uses  purely  local  information  (the  norm  of  the 
gradient  of  the  condition  number)  to  answer  a  global  question:  how  far  away  is  the  nearest 
point  in  a  (generally  quite  complicated)  set  of  ill-posed  problems?  This  approach  is  quite 
similar  in  spirit  to  Wilkinson's  "fast  perturbation  theory"  for  eigenvalues  [17],  with  which  we 
compare  our  method  in  section  4  below.  In  fact,  we  argue  in  section  4  that  the  idea  behind 
using  inequalities  (1.1)  to  get  distance  estimates,  doing  steepest  ascent  on  the  condition 
number  k,  generalizes  Wilkinson's  fast  perturbation  theory  to  a  numerical  approach 
applicable  to  a  large  class  of  problems. 

The  differential  inequalities  (1.1)  have  two  simple  and  attractive  interpretations.  To 
state  the  first  one  we  need  to  define  the  relative  condition  number  of  the  mapping  g  at  x  as 

k    ,<*  x)  =  limsup   ll*(«+»*)-g(«)IMI*(«)ll 


=  llPgQOIHMI 
11*0011      ' 

where  the  second  definition  is  only  true  if  the  Frdchet  derivative  Dg  of  g  exists.  As  is  well 
known,  k„,  measures  the  maximum  instantaneous  relative  change  in  g  per  relative  change  in 
x.  It  is  easy  to  see  that  by  multiplying  (1.1)  by  ||ar||  /  k(x),  we  get 

m-K(x)  =s  Krel(K,x)  £  Mk(x)     if  |jx||=l    .  (1.2) 

Inequalities  (1.2)  mean  that  solving  the  problem  x,  normalized  so  ||x||=  1,  is  essentially  just  as 

hard  (within  factors  m  and  M )  as  computing  the  condition  number  k  of  the  problem  x.   If  we 

further  assume  that  k   is  homogeneous,   i.e.     k(ox)  =  a*K(x)  for  all  real  positive  a,  then 

inequalities   (1.1)   and   (1.2)   can   be   shown   nearly   equivalent   (see  section  7).   The   near 

equivalence  of  inequalities   (1.1)    and   (1.2)   is   very   satisfying  because   it  says  that  if  the 

condition  number  k  has  the  utterly  reasonable  property  of  being  just  as  hard  to  compute  as 

the  solution  x  itself,  then  it  has  the  attractive  geometric  property  of  being  the  reciprocal  of 

the  distance  to  the  nearest  infinitely  ill-conditioned  problem.  Indeed,  the  common  formulas 

for  relative  condition  numbers  (e.g.  ||A||||A_1||  for  matrix  inversion)  lead  one  to  believe  that 

one  must  solve  the  problem    (e.g.   compute  A-1)   to  within   reasonable  accuracy  to  get  a 

reasonably  accurate  condition  number.    This  intuition  is  corroborated  by  the  results  in  this 

paper. 

The  second  interpretation  of  (1.1)  is  as  a  restatement  of  Newton's  method.  This 
interpretation  applies  only  when  the  mapping  g,  which  maps  a  problem  to  its  solution,  has  as 
domain  and  range  either  the  real  or  the  complex  numbers,  and  is  smooth  except  for  poles. 
For  example,  g  may  be  a  rational  function  of  a  single  real  or  complex  variable.  As  condition 
number  we  take  the  absolute  condition  number,  which  is  just  a  multiple  of  the  relative 
condition  number  defined  above: 

k  h  (g  x)  =  umsup  ll*('+»«W(*)ll'll*(*)ll  =  IJPirMII 

Kabs{g,x)       umsup  M  |k(x)|| 

Since  both  domain  and  range  are  one  dimensional,  Kabs(g,x)  can  be  written  as  \g'(x)  I  g(x)\. 
If  g  is  smooth  except  at  poles,  the  condition  number  can  be  infinite  only  at  poles  and  zeros  of 


g- 

The  problem  of  finding  the  distance  to  the  nearest  ill-posed  problem  thus  becomes  the  zero 
(or  pole)  finding  problem.  If  we  are  close  enough  to  a  zero,  we  expect  the  absolute  value  of 
the  Newton  correction  gig'  to  be  a  good  estimate  of  the  distance  to  the  nearest  zero.  But 
\g/g'\  is  just  the  reciprocal  of  the  condition  number  Kabs.  In  section  8  we  show  that  (1.1)  will 
asymptotically  hold  with  m  =  M  =  1  in  a  sufficiently  small  neighborhood  of  the  set  of  ill-posed 
problems,  thus  yielding  the  Newton  correction  as  the  correct  distance  estimate.  This  also 
works  in  the  neighborhood  of  multiple  zeros  and  poles. 

This  connection  with  Newton's  method  becomes  weaker  when  the  domain  and  range  of 
g  are  just  linear  spaces  of  the  same  dimension  greater  than  one.  Nonetheless,  there  is  a 
connection  which  we  also  explore  in  section  8. 

For  eigenvalue  problems,  polynomial  zero  finding,  and  pole  placement  this 
interpretation  does  not  apply.  The  reasons  are  twofold:  first,  the  problem  space  and  solution 
space  are  of  different  dimensions,  and  ill-posed  problems  occur  not  at  zeros  and  poles  but  at 
branch  points.  It  is  natural  to  ask  if  some  general  statement  can  be  made  about  the  condition 
number  and  distance  to  the  nearest  branch  point.  It  turns  out  we  can  show  that  in  a 
sufficiently  small  neighborhood  of  a  branch  point  of  any  algebraic  function,  the  distance  is 
bounded  below  by  a  multiple  of  the  reciprocal  of  the  square  of  the  condition  number.  We 
show  this  in  section  9.  This  result  is  reflected  in  gaps  between  the  best  known  upper  and 
lower  bounds  on  the  shortest  distance  for  eigenvalue  problems  (section  4)  and  polynomial 
zero  finding  (section  5). 

The  rest  of  the  paper  is  organized  as  follows.  In  section  2  we  present  our  differential 
inequalities  and  solve  them.  Sections  3  through  6  cover  matrix  inversion, 
eigendecompositions,  polynomial  zero  finding,  and  pole  assignment.  Sections  3  through  6 
may  be  read  independently.  The  results  on  matrix  inversion  and  some  of  the  results  on 
eigendecompositions  are  known,  but  others  are  new.  Much  related  work  on  the  eigenvalue 


problem  has  been  done  by  Wilkinson  [17,  18]  and  we  compare  our  approaches  in  section  4. 
One  of  our  upper  bounds  on  the  distance  from  a  polynomial  to  one  with  a  double  zero  is 
known  but  another  is  new.  Our  lower  bound  on  this  distance  is  new.  Our  results  on  pole 
assignment  are  also  new.  Section  7  discusses  the  equivalence  of  inequalities  (1.1)  and  (1.2) 
when  the  condition  number  is  homogeneous.  In  section  8  we  discuss  the  connection  with 
Newton's  method  mentioned  above.  In  section  9  we  show  that  when  the  solution  of  the 
problem  is  any  algebraic  function  we  expect  a  lower  bound  on  the  distance  in  terms  of  the 
reciprocal  of  the  square  of  the  condition  number.    Section  10  discusses  extensions. 


2.  Differential  Inequalities 

The  differential  inequalities  we  need  are  given  the  following  lemmas.  The  first  one  will 
be  used  to  derive  an  upper  bound  on  the  distance  to  the  nearest  ill-posed  problem  in  terms  of 
the  condition  number. 

Lemma  1:  Suppose  m>0,  y0>0,  a>l  and 

jLy(s)  *  m  y*(s)       ,     y(0)  =  y0   . 
as 

Then  y(s)  becomes  infinite  for  some  s  satisfying 

1 


0  <  s  < 


(*-l)m-yg-x 

Proof:  The  differential  inequality  implies  y  is  positive  and  strictly  increasing,  so  by  a  standard 
result  in  the  theory  of  ordinary  differential  equations  [6,  Thm.  III. 4.1]  it  is  bounded  below  by 
the  solution  of 

j-z(s)  =  m  z°(s)        ,       z(0)  =  y0 
which,  as  easily  verified,  is 

vo 


r(*)  = 


(1-  (a-^myo""1  .)(«-»"' 
Since  z(s)  has  a  pole  at  l/((a-l)  m  Vq-1),  y(s)  must  have  a  pole  before  that.  Q.E.D. 

Now  suppose  k  were  continuously  differentiable  wherever  it  was  finite  and  that  the 
norm       ||Z)k(x)||      were      an      operator      norm      induced      by      some      vector      norm: 

||Z>k(x)||  =   sup  Dk(x)  y-    Suppose  further  that  Dk(x)  had  a  continuous  dual  vector  field 

IMI=i 

y(x).  In  other  words  y(x)  should  satisfy  |[y(x)||=l  and  Z)K(x)y(x)  =  ||Dk(jc)||.  Then  we 
could  define  a  curve  x(s),  parameterized  by  arclength  (||*(s)||=  1),  as  the  integral  curve  of 
the  vector  field  y(x)  passing  through  x(0)  =  x.  The  curve  x(s)  is  simply  the  curve  along  which 
k  increases  most  rapidly  at  each  point  (the  curve  of  "steepest  ascent").  If  Dk(x)  satisfied 
||Z)k(x)||  &  otk2(x),  then  by  the  chain  rule  k(x(j))  would  satisfy 


-f  k(x(j))  =  Dk(x(*))x«  =  ||Ok(x(*))||  =  m-K^x(s)) 
as 

so  by  Lemma  1  l/(mic(x))  would  be  an  upper  bound  on  the  distance  in  the  ||||  norm  from  x 
to  the  nearest  ill-posed  problem.  Unfortunately,  k  is  not  everywhere  continuously 
differentiable  for  the  problems  we  consider  in  this  paper.  Nonetheless,  we  will  see  that  it  is 
always  smooth  enough  to  construct  a  smooth  curve  x(s)  along  which  k(x(j))  increases 
sufficiently  fast  to  apply  Lemma  1. 

The  next  differential  inequality  will  yield  a  lower  bound  on  the  distance  to  the  nearest 
ill-posed  problem  in  terms  of  the  condition  number.  First  define  the  right  derivate  of  a 
continuous  function  /  as 

D  +/(*)  -  limsiip  lS£±KhJl*l    . 

A-0+  h 

Lemma  2:  Suppose  M>0,  y(s)>0  for  all  s,  y0>0,  p>l  and 

D+y(s)  =s  m  y\s)       ,      y(0)  =  y0    . 
Then  y{s)  is  finite  for 

Os  s  < 


(p-l)-M-yr1 
Proof:  Since  y(s)  is  positive,  the  same  standard  result  as  used  in  the  proof  of  Lemma  1 

implies  that  it  is  bounded  above  by  the  solution  of 

jjz(s)  =  M  zHs)       ,       *(0)  =  y0 
which,  as  in  the  last  lemma,  is 


z(s)  = 


(l-  O-DMyg-^ye-')-1 

Since  z(s)  is  finite  for  all  s  less  than  l/((p-  i)  M  y§_1),  so  is  y(s).  Q.E.D. 

Now  suppose  k  were  continuously  differentiable  wherever  it  was  finite  and  satisfied 
||Dk(x)||  s  W-k2(x),  the  norm  on  Dk  an  operator  norm  as  before.  Then  for  every  smooth 
curve  x(s)  parameterized  by  arclength  and  passing  through  x(0)  =  x,  k(x(j))  would  by  the 
chain  rule  satisfy  the  conditions  of  Lemma  2,  thus  yielding  a  lower  bound  1/(Mk(x))  on  the 

8 


distance  from  x  to  the  nearest  ill-posed  problem.  As  mentioned  above,  k  is  not  continuously 
differentiable  for  the  problems  we  consider,  but  it  is  smooth  enough  to  satisfy  the  constraints 
of  Lemma  2  for  smooth  curves  x(s). 


3.  Matrix  Inversion 

In  this  section  II- II  will  denote  an  arbitrary  vector  norm,  ||-||D  the  dual  norm: 


and  ||A||  the  induced  matrix  norm: 


*-'"» "  wtn 


w-s*W- 


Let  P  be  the  set  of  singular  matrices.  Let  dist(A,P)  denote  the  minimum  distance  from  the 
matrix  A  to  the  set  P:  dist(A,/>)  =  inf  |lA-5||. 

SiP 

As  discussed   in   the  introduction,    ||A_1||  is  a  condition   number  for  the  problem  of 
inverting  the  matrix  A .  This  is  true  because  to  first  order 

(A  +  8A)"1  =  A"1  -  A-'SAA-'  +  0(||8A||2)  =  A"1  +  X 
so  that  for  small  8A 

-JM--  ||A-i||.||oA||    . 


IIA"1!! 
The  following  result  is  originally  due  to  Eckart  and  Young  [4]  when  ||||  is  the  Euclidean 

norm  and  to  Gastinel  [8]  for  arbitrary  norm: 

Theorem  1:  (Gastinel)  Let  P  be  the  set  of  singular  matrices.  Then 

dist(A,/>)  =  llA-'H"1    , 
i.e.  the  reciprocal  of  the  condition  number  ||A_1||  of  the  problem  of  inverting  A. 

Proof:  If  HBAl^llA-1!!-1  then  A  +  8A  is  invertible  since  (A +  8A)_1=(/  +  A-18A)-1A_1  and 
||A_18A||  £  ||A-1||  ||8A||  <  1.  Therefore  dist(A ,/>):£ ||A_i||-1.  To  show  equality  holds  choose 
x  and  y  such  that  ||jr||  =  ||yr||£,  =1  and  yTA~lx=  ||A_1||  (the  existence  of  x  and  y  follows  from 
the  definitions  of  the  norms).  Let  8A  =  -  llA-1!!-1  xyT.  Clearly  ||8A||  =  llA-1!!-1-  To  see 
A  +  8A  is  singular  note  that  (A  +  8A)(A-1x)  =  0.  Q.E.D. 

We  now  prove  this  theorem  using  Lemmas  1  and  2.  This  alternate  proof  is  no  simpler 
than  the  above  one,  but  illustrates  the  techniques  we  use  later. 


10 


Theorem  2:  Let  P  be  the  set  of  singular  matrices.  Then 

dist(A,/>)  =   HA-1!!"1    , 
Proof:  To  show  dist(A,P)  2:  HA-1!!-1  let  A(s)  be  any  smooth  path  from  A(0)  =  A  to  A(s0)£P 
parameterized  by  arclength  (i.e.  ||A(.s)||  =  1).  Then 


D*|H-tW||  =  Hmsup   l|A-'(*  +  *)H-||A-V)ll 
A-0+  h 

m  limsup    \\(A(s)  +  hA(s»-l\\-  \\A-Hs)\\ 

A-0+  h 

=  Umsu      \\A~Hs)  -  hA-Hs)A(s)A-H*)\\  ~  \\A-H*)\\ 

A-0+  h 

£  limsup    H*A-'(*)A(i)A-'(,)||   *   |ki-i(t)|p    . 
A-0+  h 

Applying  Lemma  2  with  M  =  \  and  (3  =  2  implies  |[A~'(j)||  remains  finite  for  *<||A-1||-1. 
Since  the  path  A(s)  from  A  to  P  was  arbitrary,  we  have  dist(A,P)  &  ||A_1||_1. 

To  prove  the  opposite  inequality  we  need  to  choose  a  path  A (s)  along  which  |[A_1(i)|| 
increases  as  quickly  as  possible,  i.e.  we  need  an  integrable  vector  field  X(A),  ||X(A)||=1, 
where  ||A"1X(A)A_1||  =  ||A_1||2.  Let  x(A)  and  y(A)  be  defined  as  in  Theorem  1: 
ll*(A)||  =  l|yr04)llz>  =  1  and  yT(A)A ~lx(A)  =  ||A_1||.  Now  let  X(A)  =  x(A)  yT(A).  Assume 
for  the  moment  that  X(A)  is  integrable,  and  let  A(s)  be  an  integral  curve  parameterized  by 
arclength  such  thatA(0)  =  i4  and  |[/4_1(.r)||  is  increasing.  We  will  show  that 

£||A-»(j)||-  ||A-»(#)IP 
so  by  Lemma  1  (with  m  =  1  and  a  =  2)  ||A_1(j)||  becomes  infinite  for  s=  |lA-1||_1  as  desired. 

We  show  X(A)  is  integrable  by  integrating  it  explicitly.  Its  integral  curves  are  straight 
lines  as  they  must  be  since  they  are  geodesies  in  a  normed  linear  space.  To  prove  this  it 
suffices  to  show  that  if  ||x||  =  Wy7]^  =  1  and  yTA~lx  =  ||A-1||,  then 
yT(A-sxyT)x  =  ||(A - jryr)_1||  for  s  sufficiently  small.  This  follows  from  the  Sherman- 
Morrison  formula  [5] 
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l„,7->.  -1 


SO 


(A-sxy1)    >  =  A       + *—_ - 

1  -  sy'A    '. 


and 


y  (A  —  sxy')    lx  =  y 


x  =  yTA-*x  +  ^ 


^A-'xHvrA-M    =   1U.M|  tlJA-'IP       =  IIA-'II 

1  -  syTA-'*  1  "  *I|A-MI  1  -  A\A~l\ 


so 


yT(A-sxyT)-*x  =  ||(A-«y»)-MI  =    ,     l|A",'11  „, 

1  -  « ||A    Ml 


Finally,  differentiating  we  see 


-S-UiA-sxyT)-^ 


i=0 


d       ||A-'1I 
*    l-*|lA-'| 


=  Iki-'ll2 


ti=0 


as  desired.   Q.E.D. 


In  this  proof,  we  explicitly  integrated  the  vector  field  X(A)  in  order  to  show  integral 
curves  existed.  This  is  not  generally  possible  or  desirable,  and  in  our  later  examples  we  only 
show  that  X(A)  is  continuous,  which  is  sufficient  for  integrability. 
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4.  Eigenvalue  and  Eigenvector  Computations 

In  this  section  we  consider  the  problem  of  computing  a  simple  eigenvalue  or 
corresponding  eigenvector  of  a  general  matrix  T.    In  this  section  we  let  ||-||  denote  the  2- 

norm  ||x||  =  (£  |*,|2)1/2> 

IIT-H  =  sup  JJM 

Wn  xl%      ||x||        ' 

and  ||r||f  =  (2l7"yl2)1/2-   Let  ^  be  a  simple  eigenvalue  of  T,  x  its  right  eigenvector  and  yT  its 

left  eigenvector,  where  we  normalize  so  that  yTx=\.  The  projector  P  belonging  to  X  is 
defined  as  xyT  and  has  norm  ||P||  =  ||x||  ||yr||.  It  is  well  known  [15]  that  if  we  perturb  T  by 
87",  X  can  change  at  most  by  |BX|  s  \\p\\  ||oT||  for  small  87",  and  that  this  bound  is  attainable. 
Therefore,  we  call  ||i»||  the  condition  number  of  the  eigenvalue  X.  It  is  also  known  [9,  16] 
that  the  distance  from  T  to  a  matrix  which  has  a  double  eigenvalue  (at  X)  is  bounded  by 
l|r||/(||P|p-l)1/2,  or  approximately  ||r||/||P||  for  large  ||P||.  An  n -tuple  eigenvalue  X  is 
infinitely  ill-conditioned  because  a  perturbation  of  size  e  in  the  matrix  can  change  X  by  tVn, 
whose  derivative  at  c  =  0  is  infinite.  Therefore  we  may  take  the  set  of  matrices  with  multiple 
eigenvalues  as  our  surface  of  ill-posed  problems.  Thus  the  reciprocal  of  the  condition  number 
||P||  bounds  the  relative  distance  to  the  nearest  infinitely  ill-conditioned  problem.  We  will 
obtain  this  result  using  Lemma  1. 

Similar  considerations  show  that  the  same  surface  is  the  set  of  ill-conditioned  problems 
when  computing  eigenvectors,  although  the  condition  number  for  eigenvectors  differs  from 
the  one  for  eigenvalues.  It  is  known  that  the  reciprocal  of  the  eigenvector  condition  number 
is  a  lower  bound  on  the  distance  to  the  nearest  matrix  with  multiple  eigenvalues  [2,  14].  We 
will  obtain  this  result  using  Lemma  2. 

There  can  be  quite  a  gap  between  these  upper  and  lower  bounds  on  the  distance  to  the 
nearest  matrix  with  a  multiple  eigenvalue.  Several  authors  [2,  9,  16,  17,  18]  have  explored 
the  geometry  of  the  set  of  matrices  with  multiple  eigenvalues,  and  attempted  to  find  simple 
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ways  to  measure  the  distance  from  a  given  matrix  to  that  set,  but  large  gaps  still  remain 
between  the  best  known  upper  and  lower  bounds.  Wilkinson  [17,  18]  in  particular  has  pointed 
out  the  inadequacy  of  the  current  bounds,  and  suggested  a  numerical  approach  to  find  the 
distance  from  any  given  matrix  to  the  nearest  matrix  with  multiple  eigenvalues.  His  method 
also  depends  on  using  perturbations  which  cause  the  eigenvalues  to  rush  together  and 
increase  their  condition  numbers  nearly  as  quickly  as  possible  (he  calls  this  "fast  perturbation 
theory");  a  major  point  of  our  work  is  that  this  approach  can  be  used  on  a  wide  variety  of 
problems,  not  just  the  eigenproblem.  Such  a  numerical  method  may  find  a  much  closer  ill- 
posed  problem  then  the  bounds  provided  by  our  theorems  can  guarantee.  We  will  point  out 
the  relation  between  our  approach  and  his  as  we  go,  and  and  at  the  end  we  will  summarize 
and  compare  the  various  bounds  in  the  literature.  We  will  also  propose  an  explanation  of  the 
gap  between  our  upper  and  lower  bounds  as  a  feature  of  any  algebraic  function  (see  also 
section  9). 

First  we  need  some  notation.  Since  our  matrix  norm  is  invariant  under  orthogonal 
transformations,  we  may  assume  without  loss  of  generality  that  our  matrix  T  is  in  Schur 
canonical  form  [5]: 


X   xT 
0    B 


T  = 

Occasionally  we  will  write  X(r)  to  emphasize  the  (continuous)  dependence  of  X  on  T.  Let 
r=  (fl-X)-7x.  It  is  easy  to  show  that  in  this  coordinate  system  the  right  and  left 
eigenvectors  of  X  may  be  written  x  =  [1,0,  .  .  .  ,0]r  and  yT  =  [1,— r  ],  so  that 
(|[P|p-l)1/2  =  ||r||.  It  is  to  this  last  quantity  we  shall  apply  Lemma  1.  If  we  perturb  T  to 
T+&T,  with 

srH  er12 


o7~2)     o7~22 


Br  = 

partitioned  conformally  with  T,  then  to  first  order  in  87"  P  is  perturbed  to  [10] 
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P  +  8P  =  P  +  SSTP  +  PSTS 
where  S  is  the  reduced  resolvent,  or 


(4.1) 


S  =  lim  (I-P)(T-z)~l  = 
in  this  coordinate  system.  Expanding  (4.1)  yields 


0  SiB-xy1 

0     (B-X)-' 


P  +  &P  = 


1    -rT 
0      0 


rT(B-k)-lbT2l 

(s-\)->8r21 


(4.2) 


hTurT(B-\)~l  +  &Tl2(B-\yl  -  rTbT22(B-kyl  -  rT(B -\yl&T21rT  -  rThT2lrT{B  -X)"1 

-(B-\y*hT2lrT 

We  consider  two  perturbations,  one  where  only  87"22  is  nonzero  (this  is  the  perturbation 
used  in  [16])  and  one  where  only  8r21  is  nonzero.  It  is  easy  to  find  the  smallest  perturbation 
of  87"22  that  makes  T+hT  have  a  double  eigenvalue  at  X: 

Theorem  3:  [Wilkinson]  If  ||P||>1  then  there  exists  a  8T  with  only  87"22*0  such  that  T  +  ST 
has  a  double  eigenvalue  at  X  and 

1187-11 


(||P  IP-  1)1* 
Proof:  The  87"22  we  seek  is  the  smallest  perturbation  such  that  B-k+bT22  is  singular,  which 

has  norm  ||8r22||  =  IKS  —  X)"1)!-1-  But  since  r=(B-X)"rx, 

||r||s  ||r||||(*-X)-i||s  ||r||||(B-X)-'||or 


||fr-X)-»||-».«-— Ifcll— 

IHI       (||p||2-i)I/2 


as  desired.  Q.E.D. 


As  pointed  out  by  Wilkinson,  this  upper  bound  can  be  much  weaker  than  the  simpler 
upper  bound  ||(B-X.)~l||-1.  For  example,  if 


X   xT 
0    B 


1  +  e    0 
0      1 


the  upper  bound  of  Theorem  3  is  infinite  and  ||(B-X)-1||_1  is  e.  Nonetheless,  it  does  provide 
a  one-sided  inequality  between  the  shortest  distance  and  the  reciprocal  of  the  condition 
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number. 

Now  we  prove  this  result  using  Lemma  1: 
Theorem  4:  If  ||B||>1  then  there  exists  a  67"  with  only  8r22*0  such  that  T+&T  has  a  double 
eigenvalue  at  X.  and 

1187-11  =s  l£U . 

Proof:  To  apply  Lemma  1  when  only  8r22*0  we  need  to  compute 


(||/>  +  6P||2-  1)1/2-  (|[P||2-  l)1/2=  (|| 


1    -rT-rThT22{B-\)-1 
0  0 


II2  "  1)1/2  "  (II 


1    -rT 
0      0 


a  _  urn 


II2  "  1) 


Ml  -   lkr| 


=   ||rr  +  rr8r22(5-X)- 
when  ||8r||  approaches  zero.  Letting  ru=r/||r||,  it  is  easy  to  see  this  last  expression  is 

Re||r||rur8r22(B-X)-1Fu 
to  first  order  in  87"22.    Now  since  r  =  (B-X)_rx, 

||r|p  =  xT(B  -  X)"'(B  -  X)"1**  =  xT(B  -  \yl7  £  ||x||  ||(fl  - X)-»r1| 
so 

ll(B   X)     "      IMI       Urn   • 

Therefore  by  choosing  87"22  to  be  a  small  multiple  of 


r-TVR-x^-i* 


we  get 


FrT(B  -  X) 


RellrlKsr^B-X)-1^ 


(4.3) 


fell  1HI2 
\\T\\ 


This  implies  that  we  may  choose  87"  so  that  the  rate  of  change  of  (||P||2- 1)1/2  =  IMI  is  at 
least  ||r||2  /  ||r||.  The  vector  field  given  by  (4.3)  above  is  clearly  smooth  and  integrable,  so  we 
may  let  y(s)  =  ||r(j)||  where  r(s)  is  computed  along  an  integral  curve  of  the  vector  field. 
Thus 
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*7(}      ||r|| 

and  we  may  apply  Lemma  1  to  y(s)  with  m  =  ||7"||_1  and  a  =  2  to  get  the  desired  upper  bound 


y(0)       (|[/>|p-i)1/2 

on  the  distance  to  the  nearest  matrix  with  X  as  a  double  eigenvalue.  Q.E.D. 

We  may  prove  a  similar  theorem  when  we  only  perturb  7"21.  Intuitively  we  would 
expect  such  a  perturbation  to  be  at  least  as  effective  as  one  in  the  B7"22  position  since  it  can 
move  both  the  eigenvalues  of  B  and  X(7"+87"). 

Theorem  5:  If  ||P||>1  then  there  exists  a  87  such  that  7+87  has  a  double  eigenvalue  at 
X(7  +  87)  and 


||sr|| 


2||r| 


(If IP-  D1* 

Proof:  Setting  all  but  8721  to  0  in  (4.2)  yields 


1    _rn  rr(B-X)-18721    -rT(B-\yllT2lrT  -  rT*T2,rT(B-k)-1 

0      0    J  (B-X)-^^!  -(B-\)-*&T2irT 

When  87  is  small  the  square  of  the  norm  of  this  perturbed  projector  is  at  least 


1  +  2Rer7"(fl-X)-18r21  +  ||r|P  +  2||r||Re(rr(B-X)-18721rrFH  +  rrBrMrr(B-X)-,r,) 

=  1  +  ||r|P  +  2Rer^(B-X)-i[(||r[P  +  1)  +  IMPOST,, 
where  r„  =  r/\\r\\.    Now 


||2r'(B-X)-»[(||r|P+  1)+   ||r|PFu^|| 


2IM 


l|B-X||  ||[(||r|P  +  1)+  HrHV^-'ll  ':    n  (4-4) 
where  ||7||F  is  the  Frobenius  norm  (the  reason  for  this  choice  instead  of  the  smaller  ||7||  will 

be      clear      in      a      moment).       Thus      by      choosing      8721      a      small      multiple      of 

[2rr(S-X)->[(||r|P  +  1)  +  ||r|Pv3]*  we  get 


(||/»  +  W|P-  l)"2-  (||P |P-  1)W 


(||p  IP- 1)  ||8r21|| 


2117-1 


Since   ail   quantities   defining   8721    are   analytic   [10]   the  vector  field   defined   by   8721   is 
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integrable.  Furthermore,  it  is  orthogonal  to  T  (in  the  tr7"*8r21  inner  product)  for  all  T. 
Therefore  its  integral  curves  lie  on  spheres  of  constant  \\T\\F.  Thus,  along  an  integral  curve 
the  function  y(s)  =  (||/>|p  -  1)1/2  =  ||r(*)||  satisfies 

d(s)  *  J^i£L 

so  we  may  apply  Lemma  1  with  m  =  l/(2||r||/-)  and  a  =  2.  Note  that  ||7"||f  is  constant  along 
the  curve.  Lemma  1  implies  that  there  is  a  perturbation  8r21  of  2-norm  at  most 
1\\T\\f  '  (ll'>ll2-1)1/2  that  makes  the  eigenvalue  \(T+ST)  double.  Q.E.D. 

Perturbations  in  7"21  were  also  considered  by  Wilkinson  [17,  18]  under  the  name  "fast 

perturbations,"  since  for  many  nearly  defective  matrices  they  can  make  the  eigenvalues  rush 

together  as  fast  as  possible.  In  a  series  of  examples  Wilkinson  showed  one  could  expect  to 

find  a  smallest  perturbation  to  make   "close"   eigenvalues  coalesce  of  size  approximately 

min    1/2  |X-X'|/  (||Pj|  +  ||P^.||),  where  o-(fl)  is  the  spectrum  of  B,  instead  of  \\T\\/  ||P||. 

X'€cr(B) 

The  intuition  behind  Wilkinson's  estimate  is  this:  the  distance  from  X  and  X'  have  to  move  to 
merge  is  |X-X'|  and  their  speeds  are  ||/\||  and  ||Px#ll-  The  factor  1/2  comes  from  their 
acceleration  as  they  approach  one  another.  At  the  end  of  this  section  we  show  that 
Wilkinson's  estimate  is  nearly  a  lower  bound  on  the  norm  of  the  smallest  perturbation  needed 
(although  it  may  occasionally  be  a  gross  underestimate)  and  in  fact  belongs  to  a  family  of 
lower  bounds  including  the  one  in  Theorem  7  below. 

We  can  motivate  Wilkinson's  estimate  from  the  proof  of  Theorem  5.    In  making  the 
estimate  (4.4)  we  used  the  bound 

Tftt-W-lU  >  — 1H1 


\rT(B-\) 


ll»-X|| 
which  although  attainable  is  often  pessimistic.  When  X  is  very  close  to  an  eigenvalue  of  B 

|K(B-X)->||  = 


min    |X-X'| 


is  a  much  better  approximation,  and  leads  to  Wilkinson's  estimate  when  ||/\||>||/V||. 
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We  turn  now  to  computing  eigenvectors.  Let  |||  |||  denote  an  arbitrary  operator  norm. 
From  (4.1)  we  can  see  that  if  T  is  perturbed  by  ST,  P  can  be  perturbed  to  first  order  by  at 
most  HI  bP  HI  =s  2 HI  S  HI  •  HI  P  |||-  HI  87 1||.  A  close  examination  of  (4.2)  shows  this  bound  can  be 
nearly  attained,  so  |||£|||  -|||P|||  may  be  used  as  a  condition  number  for  P.  The  next  theorem 
will  use  Lemma  2  to  show  that  1/(|||5|||  •|||/>|||)  is  a  lower  bound  on  the  distance  to  the  surface 
of  ill-posed  problems.  This  result  is  essentially  identical  to  other  results  in  the  literature  [2, 
14]. 

Theorem   6:   The  distance  in  the   ||||||    norm  from  T  to  the  nearest  matrix  T+bT  where 
\(T+bT)  is  a  multiple  eigenvalue  is  at  least 

1 

HMD -111*111    ' 

Proof:  We  need  to  compute  the  gradient  of  |||S|||  •|||i>||| .    Since  we  are  only  interested  in  an 
upper  bound,  it  will  suffice  to  use  the  first  order  bound 

P+8S||H||P  +  8P|||  -  HI  S  HI  -HI  P|||  *  HI  S  HI-  HI  BP  HI  +  HI  bS  HI  -HI  P|||    . 
Following  Kato  [10]  we  may  compute  to  first  order 

s  +  bs  =  (/-p-8p)(\  +  8\  -  (/-p-8/>)(r+sr))-I(/-p-8P) 

=  S  -  PbTSS  -  SSbTP  -  2(trPST)SS  +  SbTS 
so  that 

III  85 hi  *5|i|p|HM||H||8r|||  • 

Here  we  use  the  fact  that  P  is  of  rank  1  to  bound  |tr87P|s|||87'|||  -|||P||| .    Similarly  from  (4.1) 

1118*111  s2|||S||H||P||H||8r|||    • 
Therefore 

III  S  +  bS  HI-  HI  P  +  BP  HI  -  HI  5  HI -HI  P|||  s7(||M||-|||P|||)H||8r|||    • 
Now  let  y(i)  =  |||S(s)|||  '|||*(*)lll  where  T(s)  is  any  smooth  curve  parameterized  by  arclength 

from  y(0)  =  r  to  T(s0)  where  \(T(s0))   is  a  double  eigenvalue.   Then 

±y(s)  s  7y\s) 

so  we  may  apply  Lemma  2  with  M  =  7  and  p  =  2  to  get  that  s0,  the  distance  in  the  |||  -|||  to  the 
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matrix  T  +  &T  with  \(T  +  ST)  a  double  eigenvalue,  satisfies 


7-5 


Q.E.D. 


By  choosing  a  specific  |||-|||  we  will  see  that  this  result  recovers  a  number  of  lower 
bounds  in  the  literature.  For  example,  if  we  choose  |||X|||  =  ||XXX_1||  where  K  diagonalizes 
T,  we  get  the  lower  bound  of  the  Bauer-Fike  Theorem.  We  can  also  recover  one  of  the  best 
current  lower  bounds  in  the  literature  to  within  a  constant  factor  [2,  14].    Let 


R  = 


1     -rT 
0    HP  ||/ 

It  is  easy  to  see  that  RTR~l  =  diag(X,B),  and  in  fact  the  condition  number 


k(*)=  IIKIMIK-1!!-  11*11+  (IflP-i)1* 

of  R  is  the  minimum  over  all  matrices   which  block  diagonalize  T  [1].  Now  choose 

111*111  =  \)RXR-l\\  ■ 
Then  it  is  easy  to  see  that  the  lower  bound  of  Theorem  6  becomes 


7-     SI 


0  0 

0    (fl-X)-1 


1  . 


1   0 
0   0 


1-1  = 


»»i»(*-*) 


Since  ||X||  a  |||X|||  /  k(P),  we  see  that  a  lower  bound  in  the  ||-||  norm  on  the  distance  from  T 
to  the  nearest  matrix  T+  hT  where  X(r+  87)  is  a  double  eigenvalue  is 


7(||P|I+  (IIP IP- l)1*) 
which  is  within  a  constant  factor  of  the  lower  bound  amin(B-X)  /  (4  ||P||)  on  the  distance  in 

the  \\-\\F  norm  in  the  literature.  (This  lower  bound  may  be  improved  by  at  most  a  factor  of  4 

to  inf  max  (|X-X'|  ,  amiB(B-X'))  /  (||P||  +  (l|P|F-l)1/2)  [2]). 

We  can  recover  the  lower  bound  amin(5-X)  /  (4  ||P||)  by  using  a  quantity  proportional 
to  it  rather  than  |||£|||  |||P|||  as  a  condition  number.  In  fact,  we  can  prove  a  more  general 
version  by  considering  a  group  of  eigenvalues  of  T  rather  than  a  single  one  X,  and  using  a 
condition  number  for  the  projector  associated  with  the  entire  group.  This  condition  number 
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will  be  finite  until  one  of  the  eigenvalues  in  the  group  merges  with  one  outside  the  group;  the 
eigenvalues  within  the  group  may  or  may  not  be  multiple.  Following  Stewart  [14],  we  assume 
T  is  initially  in  Schur  canonical  form 


T  = 


A   C 
0    B 


where  the  eigenvalues  of  A  consist  of  the  group  in  whose  projector  we  are  interested.  We 
exhibit  this  projector  by  solving 


/    -R 

0     / 


MP"]- 


A    0 
0   B 


which  upon  manipulation  yields  the  equation  RB—AR  =  C,  a  nonsingular  system  of  linear 
equations  as  long  as  the  eigenvalue  of  A  and  B  are  disjoint,  which  we  have  assumed.  Then 
the  projector  associated  with  the  eigenvalues  of  A  is 


P  = 


I    -R 

0     0 


The  equation  RB  -AR  =  C  can  be  rewritten  using  Kronecker  products  as 
(fl7®/  -  I®A)colR  =  colC,  where  V0W  =  [V(jW]  and  colC  is  a  column  matrix  made  from 
stacking  the  columns  of  C  atop  one  another  from  left  to  right.  If  we  define 
sep(A,B)  =  crmin(fir®/  -  I®A)  then  clearly  \\R\\F  s  ||C||F  /  sep(A.fl).  The  lower  bound  on 
the  distance  (measured  with  ||||f)  from  T  to  the  nearest  matrix  where  an  eigenvalue  of  A 
merges  with  one  from  B  is  sep(A,B)  /  (4||P||)  [14].  We  prove  this  result  by  using 
||P||  /  sep(A,fl)  as  a  condition  number  and  using  Lemma  2: 

Theorem  7:  Let  T,  A,  B,  and  P  be  as  above.  Then  the  distance  in  the  \\-\\F  norm  from  T  to 
the  nearest  matrix  where  an  eigenvalue  of  A  merges  with  one  in  B  is  at  least 

sep(A,fl) 
4-||P||       • 
Sketch  of  proof:  This  result  requires  some  technical  machinery  found  in  [14].  Briefly,  if  T  is 

perturbed  to  T+&T,  then  to  first  order  \\P\\  and  sep(i4,B)  are  perturbed  as  follows: 
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„.,.,,,      2-H/>|P-|lsr||F 

|8llf>ll|£        scp(A,B) 
and 

|8sep(A,B)|s2-||P|H|8r||F    . 
Combining  these  we  get 

Therefore  along  any  smooth  curve  J(j)  parameterized  by  arclength  and  passing  through 
J(0)  =  7\  the  quantity  k(T(s))  =  ||P(*)||  /  sep(A(j),B(j))  satisfies  D  +  k(T(s))  s  4-k2(7(j))  so 
by  Lemma  2  the  distance  to  the  nearest  T(s)  with  k(T(s))  infinite  is  at  least 
sep(A,B)/(4||P||).  (This  lower  bound  may  be  improved  to  sep„(A,B)  /  (||P||  +  (|l/>||2- l)"2), 
where    sepx(A,B)  =  inf  max  (<xmin(A  -A.')  ,  crmjn(fi-X'))-     This    improved    lower    bound    is 

often  close  to  sep(A,B)  /  (4  ||P||)  but  can  be  much  larger  [2].) 

Now  we  compare  and  evaluate  the  various  bounds  on  the  distance  from  a  matrix  to  the 
nearest  one  with  a  multiple  eigenvalue.  We  assume  for  the  moment  that  we  have  chosen  an 
eigenvalue  X(7")  and  wish  to  compute  the  distance  dx  =  ||87"||  to  the  nearest  matrix  T+hT 
where  X(r+87")  is  a  multiple  eigenvalue;  later  we  will  consider  the  choice  of  X.  Assume 
without  loss  of  generality  that  the  matrix  T  has  norm  ||r||f=  1  and  is  in  the  form 


T  = 


0   Bx 


Let  Px  denote  the  projector  associated  with  X.  Then  the  upper  and  lower  bounds  on  dx  we 
have  discussed  so  far  may  be  summarized  as: 

(IIPJP-1)'"         '•***-*>  *  *  *         4  ||P J|  •  ^ 

The  lower  bound  on  dk  was  improved  by  at  most  a  small  constant  factor  in  [2]  but  is  good 

enough  for  our  purposes.    Since  (||/\||2-1)-1/2  2:  omin(fix-X),  we  see  that  the  lower  bound 

on  dx  can  be  no  smaller  than  the  square  of  the  upper  bound  vmiri(Bx-\)  (recall  that  ||7*||F=1 

so  that  these  are  all  bounds  on  the  relative  distance  and  generally  less  than  1).  This  is  the 
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maximum  gap  between  the  upper  and  lower  bounds.  We  explain  in  section  9  why  we  expect 
such  a  gap  for  algebraic  functions  (such  as  eigenvalues)  in  general.  Examples  [2,9,17,18] 
have  shown  that  either  bound  may  be  a  better  approximation  of  dx;  for  example,  the  lower 
bound  is  nearly  exact  for  2  by  2  matrices. 

By  slightly  modifying  the  proof  of  the  Bauer-Fike  theorem,  we  can  show  Wilkinson's 
estimate  of  dx  is  nearly  a  lower  bound: 


dx  ^   mm r. y. rrt — rr 


(4.6) 


fc-i 


In  fact,  given  any  partitioning  a{B)  =   \J  <t,  of  the  spectrum  of  B  into  disjoint  pieces,  there 

is  a  lower  bound  of  the  form  in  (4.5)  or  (4.6);  these  correspond  to  the  extreme  partitions 
b  =  l,  (Tx=a(B)  in  (4.5)  and  b  =  n,  ffj={X/}  in  (4.6).  These  results  are  stated  in  the  next 
theorem: 

Theorem  8:  Suppose 


T  = 


B 


* 
* 

* 

6-1 


is  block  upper  triangular.  Suppose  B,  has  spectrum  <r,,  dimension  n,  and  associated  projector 
P(,  with  B0  =  [X]  for  consistency.  Let  St  be  an  n  by  n,  matrix  of  orthonormal  columns 
spanning  the  right  invariant  subspace  of  T  belonging  to  o-,,  and  let  5  =  [50|  •  •  •  |S6_,].  Then  S 
block  diagonalizes  T:  S~iTS  =  D  =  diag(D0,  ■  ■  ■  ,Db_l),  where  D,  has  spectrum  a,.  Then 

,     fc       1  •  gmm(Q,-^) 

"x  ^  —   min   -r. — - — - — —    . 

x        2b  o<i<b  max  (||P0||  ,  ||P,.||) 

Proof:  We  claim  that  if  X  =  X(7"+8r)  is  an  eigenvalue  of  the  perturbed  matrix  T  +  bT,  then 

87"  satisfies 
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||5r|1 *  j  mr    \\p,w 


(4.7) 


In  other  words,  A.  must  lie  in  an  inclusion  region  about  one  of  the  a,-,  but  the  size  depends 

only  on  the  sensitivity  of  the  cluster  llPjl  rather  than  the  maximum  sensitivity  maxllPjl  which 

i 

is  the  usual  form  of  the  Bauer-Fike  theorem.    We  prove  (4.7)  as  follows.    It  is  easy  to  see 
that  ||S||  £  VT  [i,  Thm  2]  and  and  that 


5"1  = 


?_1 


(5^    is   «,   by  n)   satisfies   ||S(q1||=  11^/11   [*>  Thm   31-     Assuming  that  A.   is  not  already   an 
eigenvalue  of  T,  we  have 

o  =  det(r+sr-\)  =  det((r-\)(/+(r-\)-16r))  =  det(/+(r-x)_,8r) 

=  det(/+(D-X)-15_18r5) 
implying  1  £  ||(£> - X)-15-18r5l|. 

So  far  this  is  the  usual  proof.  What  changes  is  that  now  we  write 

1  s  IKD-xr'S-iST-SH  £  Vfc-max  \Ypr\y^hTS\\  *  b-\\hT\\  max  ||(Z>,- X)~MI  ||P,|| 
or 


118711  £  *  m,m  — w^r~ 


proving  (4.7). 


From  (4.7)  we  proceed  as  follows.  If  X  is  a  multiple  eigenvalue  of  T+&T,  then  it  must 
lie  within  the  inclusion  regions 

ll5r"  *  *    IH\II 

for  both  i  =  0  and  at  least  one  i>0.  In  other  words 


llSrll  2:  —   min   min  max 

b    0<i<b      \ 


(|x-x|     ^(p,-M 


IKII 
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*  I  o^  max^H  ,  \\P,\\)  T  maX  (|X"^  '  '-CD,-*)) 

(where  we  have  used  [2,  Corollary  4.9]).  This  is  the  desired  result.  Q.E.D. 

When  b=l,  we  can  take  D1  =  Bl,  thus  providing  an  alternate  proof  of  the  lower  bound 
in  (4.5).  Similarly,  when  b  =  n,  we  can  strengthen  the  result  slightly  to  get  (4.6).  No 
particular  partitioning  of  <x(B)  is  always  best;  for  example,  sometimes  the  lower  bound  in 
(4.5)  is  stronger  and  sometimes  (4.6)  is  stronger,  and  both  may  simultaneously  be  gross 
underestimates.    These  results  have  also  been  obtained  by  Wilkinson  [19]. 

If  we  let  d  =  min  dk  be  the  shortest  distance  from  T  to  any  matrix  with  a  multiple 

eigenvalue,  we  see  that  any  upper  bound  on  any  <fx  is  an  upper  bound  on  d  and  that  a  lower 
bound  on  d  is  given  by  the  minimum  of  all  lower  bounds  of  all  dx.  For  example,  (4.5)  leads 
to  the  lower  bound 

mm .  .._   ,, (4.8) 


4  ||PX| 


and  (4.6)  leads  to  the  lower  bound 


.      1  |X-\'| 

mx-  n    ||PJ|+  ||/Yl 


(4.9) 


Even  though  neither  (4.5)  nor  (4.6)  is  uniformly  better  than  the  other,  it  is  easy  to  show  that 
the  bound  in  (4.9)  may  never  be  much  smaller  than  bound  in  (4.8),  but  it  may  be  much 
larger.  Apparently  no  example  has  been  found  where  (4.9)  is  a  poor  estimate  of  d,  although 
there  is  no  proof  that  this  estimate  holds  in  general. 

In  short,  estimating  dx  and  d  explicitly  seem  to  be  quite  difficult  analytical  problems.  In 
practice,  for  a  given  matrix,  the  technique  of  following  a  condition  number  "uphill"  until  it 
reaches  infinity  is  still  a  practical  optimization  method  for  getting  upper  bounds  on  dk  and  d 
even  if  it  is  hard  to  prove  sharper  explicit  bounds  in  general. 
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A  related  problem  to  estimating  d  is  computing  the  distance  to  the  nearest  matrix  with 
an  eigenvalue  of  higher  multiplicity.  Wilkinson  has  constructed  examples  where  even  though 
it  is  possible  to  coalesce  any  two  eigenvalues  with  a  perturbation  of  some  very  small  size,  it 
requires  a  far  larger  perturbation  to  make  all  the  eigenvalues  coalesce  simultaneously  [17, 
18].  Much  work  remains  to  be  done  to  understand  the  geometry  of  the  set  of  matrices  with 
multiple  eigenvalues. 


26 


5.  Polynomial  Zero  Finding 

n 

Let  p{z)  =  2  Piz'  De  a  complex  polynomial  with  a  simple  zero  at  x:  p(x)  =  0.  Let  \\p\\ 
denote  the  Euclidean   norm   of  its  vector  of  coefficients.   If  p   is  perturbed  by  adding  a 

n 

sufficiently  small  polynomial  e(z)  =  X  '(*'»  tnen  to  ^rst  order  p+e  will  have  a  simple  zero 

i  =  0 

atx  +  8x  =  x  -  e(x)/p'(x).  This  follows  from  simply  solving  the  Taylor  expansion 

0  =  (p  +e)(x+  8x)  =  p(x)  +  8xp'(x)  +  «(x)  +  0(||f|P)  =  5xp'(x)  +  e(x)  +  OflHP) 
for  8x.  Thus,  we  may  use  |l/p'(x)|  as  a  condition  number  for  the  zero  x.  In  this  section  we 

will  find  relationships  between  the  reciprocal  of  the  condition  number  |p'(x)|  and  the  distance 

from  p  (measured  using  ||||)  to  the  nearest  polynomial  where  x  merges  into  a  multiple  zero. 

An  n -tuple  eigenvalue  is  infinitely  ill-conditioned  because  a  perturbation  of  p  of  norm  c  can 

cause  a  perturbation  of  x  of  size  €1/n  which  has  an  infinite  derivative  at  €  =  0.  Therefore  we 

may  take  the  set  of  polynomials  with  multiple  zeros  as  our  set  of  ill-posed  problems. 

In  this  section  we  will  show  that  the  distance  to  the  nearest  polynomial  where  x  merges 
into  a  multiple  zero  is  indeed  bounded  by  a  small  multiple  of  the  reciprocal  of  the  condition 
number.  As  a  lower  bound  we  get  a  quantity  proportional  to  the  reciprocal  of  the  condition 
number  squared.  The  zero  x  is  an  algebraic  function  of  the  coefficients  of  p,  and  in  section  9 
we  explain  why  we  expect  a  lower  bound  proportional  to  the  reciprocal  of  the  condition 
number  squared  for  any  algebraic  function. 

The  most  general  previous  result  relating  |p'(x)|  to  the  distance  to  the  nearest 
polynomial  where  x  becomes  double  is  due  to  Hough  [7].    We  need  some  notation:  if  p  is  a 

n 

polynomial  p(z)  =  ^  Piz'  OI  degree  at  most  n,  let  p  =  [p0,  .  .  .  ,p„]r  denote  the  vector  of 

1  =  0 

its  coefficients.  For  any  complex  number  z,  let  z.  =  [l,z,z2,  .  .  .  ,z"]T.  Therefore,  p(z)  =  pjz.- 
Also,  let  z!  a  [0,l,2z,3z2,  .  .  .  ,nzn~x];  thus  p'(z)  =  p_V.  Finally,  let 
Z"  =  [0,0,2,6z,  .  .  .  ,«(n-l)z',-2];thusp"(z)  =  pV- 
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Theorem  9:  [Hough]  Suppose  p  is  a  polynomial  of  degree  at  least  2  and  p(x)  =  0.  Then  the 
smallest  polynomial  e  of  degree  no  greater  than  p  such  that  p  +e  has  a  double  zero  at  x  has 


norm 


M**l s  V2"  |p'(>)|  •  min(l,|xp-") 


a  - 


IklF 


•1 


Sketch  of  Proof:  This  is  an  under  determined  linear  least  squares  problem 


1   x   x2 
0   1   2x 


(X 


X' 

1-1 


nx 


"«ol 

n-1 

en 

= 

0 

which  can  be  solved  explicitly  for  a  solution  of  minimum  norm,  giving  the  claimed  solution. 

Our  approach  proceeds  by  computing  the  gradient  of  |l/p'(x)|  under  changes  in  p, 
subject  to  p(x)  =  0.  We  compute  this  gradient  in  the  next  lemma. 

Lemma  3:  Let  p  be  a  polynomial  of  degree  at  least  1.  Let  De  denote  the  directional 
derivative  of  l/|p'(x)|  (x  a  zero  of  p)  in  the  direction  of  the  polynomial  e,  where  ||e||=l. 
Then 


D.  = 


'         |p'(x)| 


Ret7 


SP"(x) il 


(p'(x))2         P'(x) 


Furthermore, 


Ip'WI2 

Proof:  The  directional  derivative  of  1  /  \p  '(x)|  in  the  direction  of  a  unit  vector  e_  is 
1 


D.  =  lim  — 

'         t-0    c 


|(p+e0'(*  +  8x)|         |p'(x)| 


where  x  +  8x  is  a  zero  of  p  +  te.  From  our  earlier  formula  for  8x  we  have 


D=  lim  — 

t-0     € 


b-w  _  ««(»)p"(x)  +  ee,(x)  +  0(e2)|       Ip'WI 
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=  lim 


1 


•-o    e|p'(x)| 
Noting  that  if  -n  is  a  small  complex  number, 


11  (p'(x))2  P'W 


-    1 


|i+-nl 


=  1  -  Ret,  +  O(hP)    . 


we  see  that 


De        Ip'OOI 


Re 


g OOp "00  _   *'0O 


(P'OO)2 


p'OO 


Ret7" 


g"(x)    _       i' 

(p'OO)2       p'OO 


Ip'OOI 

as  claimed.  We  can  clearly  pick  a  unit  vector  e.  to  make  De  equal  its  upper  bound 


A  = 


|*P"0O 
'  P'0O 

Ip'OOI2 


Q.E.D. 


Applying  this  lemma  for  e  perpendicular  to  £  (i.e.  c(x)  =  0)  yields  the  same  result  as  in 
Theorem  9: 

Theorem  10:  Suppose  p  is  a  polynomial  of  degree  at  least  2  and  p(x)  =  0.  Then  the  smallest 
polynomial  e  of  degree  no  greater  than  p  such  that  p  +e  has  a  double  zero  at  x  has  norm 


e     = 


\P'(*)\ 


HE'  - 


IklF 


■■ii 


Proof:  By  choosing  c  such  that  crx=0  in  Lemma  3  we  get  a  vector  field  in  the  space  of 
polynomials  along  whose  integral  curves  the  zero  x  of  p  will  not  move.  The  unit  vector  e 
which  satisfies  eTx.=  0  and  maximizes  De  is  clearly  the  one  in  the  direction  of  the  vector 
component  of  x.'  orthogonal  to  x_,  or 


eTx!  = 


Ik'  -  7r£"xll  =  «0O 


IWI2 


This  vector  field  is  clearly  continuous,  so  let  ps  be  an  integral  curve  parameterized  by 
arclength.  Let  y(s)  =  |l/p/(x)|.  Then  from  Lemma  3  we  have 
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a* 

so  by  Lemmas  1  and  2  y(j)  has  a  pole  at  s0=  \p  '(x)\  I  n(x),  i.e.  ps  has  a  double  zero  at  x.  To 

see  that  no  closer  polynomial  to  p  has  this  property,  note  that  under  the  constraint  that 
eTi=0,  \De\^  n(x)/|p'2(x)|,  so  by  Lemma  2  |p'(x)|/n(x)  is  the  minimum  distance  to  a 
polynomial  with  a  double  root  at  x  as  well.  Q.E.D. 

Without  much  more  effort,  we  get  a  similar  theorem  with  a  different  constraint  on  e: 

Theorem  11:  Let  p  be  a  polynomial  of  degree  at  least  3  and  p(x)  =  0.  Then  there  is  a 
quadratic  polynomial  e  of  norm  at  most  2|p'(x)|  such  that  p  +  e  has  a  double  zero.  This 
double  zero  corresponds  to  x  in  that  as  e  increases  from  0  to  1,  the  polynomial  p  +te  has  a 
simple  zero  which  moves  from  x  when  e  =  0  until  it  merges  with  another  zero  to  form  a 
double  zero  at  e—1. 

Proof:  The  first  three  components  of 

p'OO 

are 

\p"(x)/p'(x)  ,x-p"(x)/p'(x)  -  l,x2p"(x)/p'(x)  -  2x]=  [y  ,xy-l  ,x2y-2x]  (5.1) 
For  any  x  or  y,  one  of  the  components  of  this  last  vector  has  to  have  absolute  value  at  least 

1/2.  To  show  this,  assume  to  the  contrary  that  all  three  component  are  smaller  than  1/2.  Then 

|xy-l|<l/2  implies  |xy|>l/2  or  |x|>l/(2|y|)>l.  Thus  |x2y-2x|=|x|-|xy-2|>l-|xy-l|>l/2,  a 

contradiction.    Therefore  by  choosing  e  a  unit  vector  pointing  the  same  direction  as  the 

vector  in  (5.1),  we  get 


'       2|p'(x)P 
The  vector  field  defined  by  this  choice  of  e  is  smooth  since  the  components  in  (5.1)  are 

smooth,    so    let   ps    be    an    integral    curve    with    zero   xs,    where    s    is    arclength.    Define 
y(s)  =  \l/Ps(xs)\.Thns 
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-y(s)  *   If- 

so  by  Lemma  1  there  is  a  polynomial  ps   with  a  double  zero  with  s0=\]p-ps  ||  s  2[p'(x)|. 

Q.E.D. 

Just   as   we   used   Lemma   1   to   derive   an   upper   bound   on  the   distance  to   nearest 
polynomial  with  a  multiple  zero,  we  will  use  Lemma  2  to  derive  a  lower  bound. 

Theorem  12:  Let  p  be  a  polynomial  of  degree  n^2.  Let  ps  be  a  continuous  map  from 
j€[0,50]  to  the  space  of  polynomials  of  degree  no  greater  than  n,  such  that  p0  =  p ,  and  such 
that  s  is  the  arclength  parameter.  Let  x3  be  a  zero  of  ps  such  that  xs  is  a  continuous  function 
of  s  and  x0=x.  Then  if  xs  is  a  multiple  zero  of  ps  , 

''•"        4*3  •    max  (||pj|,||p,||  |x,p-2) 
Proof:  From  Lemma  3  we  know  that  if  ||e||=  1  then 
in  I  as  a  =    llg>"(*)-*V(x)|| 

Ip'MP 

=    \\spts."  -  S.'PTS.'\\ 

Ip'OOP 

=   lbc"rP  -xVrp|| 

Ip'MP 
Ip'WP 

where  ||-||  is  the  2-norm  of  a  matrix.  The  matrix  X2."T  -  i'i'r  is  an  n  +  1  by  n  +  1  matrix 
whose  norm  we  may  bound  simply  by 

Hii"7"  -  i'i'T||  ^  ||i||  •  ||i"r||  +  lli'll2  =  2n3  max  (l.lxP""2)    • 
This  implies 

^  ,        2n3max(l,lxP"-2)  ||p|| 
|p'(x)P 
Now  consider  the  function  y(s)  =  l/|p/(x,)|.   We  have  just  shown 
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■f  y(s)  <  2n3  maxCl.^l2"-2)  ||p,||  •  y\s) 
as 


ds 
so  that  by  Lemma  2  y(s)  remains  finite  for 


S   <   Jn  = 


4„3  •    max  (IIpJUIpJI  kj2""2) 

OSlSl. 


as  claimed.  Q.E.D. 

At  first  glance  it  would  seem  hard  to  apply  Theorem  12  since  s0  is  defined  in  terms  of 
itself.  In  practice,  however,  one  would  apply  the  theorem  when  it  is  possible  to  make  x  a 
multiple  zero  by  only  a  small  perturbation  in  p.  Thus,  xs  should  not  vary  much  from  x  nor 
should  \\p  ||  vary  much  from  \\p  ||.  In  such  cases  an  approximate  lower  bound  is  thus  provided 
by  the  expression 

\P'(*)\2 


4n3  •  ||p||  •  max(l,tx|2"~2) 
Alternatively,  if  we  scale  p   appropriately,  we  can  effectively  estimate  both  ||pj|  and 

\xs\.  For  example,  if  we  assume  that  the  coefficients  p,  of  p  are  no  larger  than  the  leading 

coefficient  p„  divided  by  n:   |p,|s|p„|  /  "   for  i<n   (which  can  be  achieved  by  the  simple 

change  of  variable  x->ax  for  appropriate  a),  we  get  the  following  bounds: 

Corollary  1:  If  p  is  a  polynomial  of  degree  n  where  the  coefficients  p,  satisfy  |p,|s  \pn\  I  n  for 
i<n,  and  if  z  is  a  simple  zero  of  p,  then  the  distance  from  p  to  the  nearest  polynomial  q 
where  z  coalesces  into  a  multiple  root  (in  the  sense  of  Theorem  12)  is  bounded  below  by 

1^1,  ^(Jkl,  \Z^I )  a  miD  (  il£i  ,   -0235  •  ^'(z)P 

llP     *"  ^    «2        (5*2  +  9  (3/8)1/2)  *3  ||p  ||  «2  "3IHI 

Proof:  Assume  -n  =  ||p-9l|s  IIpII/"2;  otherwise  the  theorem  holds  trivially.  Since 
\Pi\  £  |PnUn>  it  *s  easy  to  see  any  zero  z  °f  P  satisfies  |z|sl.  Since  \]p—  q\\=i\,  if  ps  is  any 
polynomial  on  the  line  segment  connecting  p  and  q,  \\p  —ps\\  £  -n,  so  the  coefficients  psi  of  ps 
satisfy  |pjn|  ^  |p„j—  ti  and  \psi\  :£  |p,|+t).  If  z'  is  a  zero  of  ps,  it  is  easy  to  see  z'  /  a  is  a  zero 
of  the  polynomial  p~s(x)  =  ps(a  x)  whose  coefficients  psi  satisfy  |p"ri|  =  |a'  psi\.  If  we  choose 

a  =  (IpJ  +  »*n)  /  CIp»I  -  v) then 
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|pj    _    1     \Pi\  +  ■»!    _    1     IPnl  /  "  +  1    .     J_ 

n 


\Psn\    '       a       IP«I~1  °  IP»I_1» 

so  |z'  /a|  s  1  or  |z'|  :£  a.   Thus  by  Theorem  12  ti  must  satisfy 


■n 


\p'(z)\ 


4n3(lbll  +  -n)  max(l  ,  a2""2) 


IP'Mf 


4«3(1+   -V)  IIP  II 


1  + 

» 1 
kl 

1  - 

Ip.I  J 

2n-2 


which  is  true  only  if 


4n'(l+   -±-)||p| 
nj 

Ip'(OP 

since 


|P»I  P»l 


which,  since  lip  II  s  |pj  (1  +  1/«)1/2 

■n  /  IPnl  s  IIpIU  ("2  IpbI)  £  (n_2  +  n~3)1/2  s  (n-1)"1,  is  in  turn  true  only  if 


and 


1  \l/2 


4n3(l  +  -7)  IIpII                     ,                                (2n-2)i 
-  11  (l  +  -i-)2c-')  s  1 


•(,\U 


Ip'WI 


n-l 


IIP  II 


or 


Now 


4n3  (1  +  n~2)  e2    +    (2n-2)  (1  +  n"1)1^ 

Ip'COI2  IIpII 


Ip'WI  *  «  kl  +  2  J^r- *  Ip.I  ^t1  *  IIpII  ^ 


;'<" 


so  ti  is  in  turn  bounded  below  by 


Ip'WP 


llpll  (4n3  (1  +  „"2)  ,2  +    (n-lHBn-lJMl  +  n-T2) 

Ip'WP  a    .023S|p'(z)P 

|H  n3  (5  *2  +  9  (3/8)1*)  n3lbll 


as  claimed.  Q.E.D. 
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6.  Pole  Assignment 

The  pole  assignment  problem  is  defined  as  follows:  given  an  n  by  n  matrix  A ,  an  n  by  m 
matrix  B ,  and  a  set  {X,}  of  n  complex  numbers,  find  an  m  by  n  feedback  matrix  F  such  that 
A+BF  has  eigenvalues  {A.,}.  The  motivation  for  this  problem  is  the  following:  given  a  control 
system  x  =  Ax  +  Bu,  choose  the  control  input  u  as  a  function  Fx  of  x  (feedback)  to  make  the 
matrix  of  the  controlled  system  x  =  (A  +  BF)x  have  a  specified  spectrum.  It  is  well  known  that 
this  problem  has  a  solution  for  arbitrary  {A.,}  if  and  only  if  the  pair  (A,B)  is  controllable,  i.e. 
[B|AB|A2B|  •  •  •  lA""^]  has  full  rank  n  [20].  If  the  pair  (A,B)  is  not  controllable,  then  some 
eigenvalues  (called  the  uncontrollable  modes)  of  A+BF  will  be  independent  of  F  (and  be 
eigenvalues  of  A);  the  remaining  eigenvalues  can  be  set  arbitrarily  by  choosing  F. 

The  robust  pole  assignment  problem,  as  defined  in  [11],  is  to  find  F  subject  to  the 
additional  condition  that  X,  the  eigenvector  matrix  of  A  +  BF  =  XAX-1,  be  as  well 
conditioned  as  possible  (here  A  =  diag(A,)).  The  condition  number  k(X)  =  ||X||-|[X_I||  (in 
this  section  ||||  denotes  the  2-norm  and  ||||f  the  Frobenius  norm)  of  the  best  conditioned  X 
turns  out  to  measure  the  size  the  sensitivity  of  both  F  and  the  time  dependent  solution  of  the 
control  system  x  =  (A  +  BF)x.  For  example,  if  the  A,  are  distinct,  then  |JF||  will  get  larger  as 
k(X)  gets  larger  (see  [11]  for  details).  Therefore,  we  shall  take  k(X)  as  our  condition  number 
for  the  robust  pole  assignment  problem. 

We  will  also  use  a  slightly  different  measure  of  distance  than  used  before: 

dist((A0,B0),(A^1))  -       ■   ,fA°7,Alt  M  ,   +       ■   Z^'t  „  ;    •  (6-1) 

mm(||A0||f  ,  \\AjWf)         minO^I^  ,  ||fi,||f) 

This  distance  has  the  advantage  of  being  insensitive  to  the  scaling  of  A  and  B . 

As  in  previous  sections,  we  are  interested  in  relating  k(X)  to  the  distance  from  (A,B)  to 
the  nearest  ill-posed  pair  (for  which  no  X  exists).  How  do  we  characterize  the  set  S  of 
problems  (A,B)  (for  fixed  {A,})  which  are  ill-posed?  From  the  above  discussion,  it  is  clear 
that  it  includes  all  uncontrollable  (A,B)  where  {A,}  does  not  include  the  uncontrollable  modes 
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(e.g.  if  no  X,  is  an  eigenvalue  of  A). 

Using  Lemma  2,  we  will  prove  a  theorem  which  gives  a  lower  bound  on  the  distance  to 
the  set  S  of  ill-posed  problems  in  terms  of  the  reciprocal  of  the  condition  number  k(X). 
Before  doing  so  we  state  the  following  lemma  from  [11]: 

Lemma  4:  [Kautsky, Nichols, Van  Dooren]   Let 


B  =  [l/0,I/,]- 

where  U=[U0,UX\  is  orthogonal  and  Z  is  of  full  rank.  Let  X,  =  N(C/[(A-X,/)),  where  N() 
denotes  the  null  space.  Then  the  i-th  column  x,  of  X  (the  right  eigenvector  of  A  +  BF  for  the 
eigenvalue  X,)  satisfies  x,€X,. 

Proof:  Premultiply  the  equation  A+BF  =  XAX~1  by  UT  and  rearrange  to  get 


or,  taking  the  last  n-rank(B)  rows 


F  = 


(AX  -  XA)X" 


0  =  t/[(AX  -  XA) 


Q.E.D. 


Let  Xj  be  a  matrix  of  orthonormal  columns  spanning  X,,  and  let  S  -  [Xj  •  •  •  |X„].  If  B 
is  of  full  rank  and  \(  is  not  an  eigenvalue  of  A,  X,  will  be  n  by  m.  Lemma  4  says  that  the 
eigenvector  matrix  X  can  be  written  as 


X  =  S 


0 


0        u, 


(6.2) 


where  ut  is  a  column  vector  of  the  same  dimension  as  X,.  Since  it  is  hard  to  characterize  the 
condition  number  of  X,  we  instead  use  the  following  lower  bound  based  on  S: 
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Lemma  5:  Let  X  and  S  be  defined  as  above.  Then 

k(x)  *  «r-£(s)  =  IK***)-1!!"2  =  IKSx^.O-MI1^  • 

1=1 

Proof:  Assume  without  loss  of  generality  that  ||X||=1.  This  clearly  implies  that  ||u,||sl  in 
(6.2).  Now  let  a  be  the  smallest  singular  value  of  S  and  vr  the  corresponding  left  singular 
vector,  i.e.  ||vr||=l  and  \\vTS\\  =  a.  Then  it  follows  simply  that  ||vrX||  s  a.  Thus 

kw  *  a-i(5)  =  ii(55*)-i|r  =  iid^rr'ir 

i=i 
as  claimed.  Q.E.D. 

It  is  to  (T~in(5)  that  we  will  apply  Lemma  2  to  get  a  lower  bound  on  the  distance  to  the 
nearest  ill-posed  problem. 

Lemma  6:  Let  a~^D(S(A,B))  be  the  value  of  ct~4(S)  for  the  S  defined  by  A  and  B.  Let  S'  be 
the  set  of  problems  (A,fl)  for  which  cTmin(S(A,B))  =  0.   Let  dist((A,fl),S)  be  defined  as 

dist((A,fl),S')  =      inf      dist((A,fl),(A,,fl,))    . 
Assume  B  is  of  full  rank  and  that  none  of  the  X,  is  an  eigenvalue  of  A  so  that 

*B  =    \\B\\F  '"mini*) 

and 

ka  m  max  (||A-X,||f -||(A  -X,-)"1!!  ,  l|A|HlCA-*l)_1ID 
i 

are  well  defined.  Then 

dist((A,*),S')  s  -7= -1&— . 

V»     kbka-*-US(A,B)) 

Proof:  Consider  perturbations  A  +  8A  of  A  and  B  +  85  of  B .  We  need  to  estimate  for  small  8A 

and  6fi 

%,^(S)  "  <^(S8)  =   \\(SS*rl\\  -  HAW)"1!!  =  8||(55*)-MI 
where  S  =  S(A,B)   and  .S8=5'(A +  8A,B  +  8i9).     Let  Xt   denote   the   matrices  of  orthonormal 

columns  comprising  5=[X,|  •  •  •  [X„]  and  let  o-  be  the  smallest  singular  value  of  S.    If  the 
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perturbations      8A      and      85      yield      perturbations      8X,      in      X,,      then      S      becomes 
[X,  +  8X,|  •  •  •  |X„+8X„]  and  SS*  becomes  (to  first  order) 

SS*  +  2  bXpT+XfiXj  =  SS*  +  [8XJ  •  •  •  |8X„]S*  +  Si&Xj  ■  ■  ■  |8X„]r 

1=1 

and  (SS*)'1  becomes  (again  to  first  order) 

(SS*)-'  -  (SS*)-'  ([8X,|  •  •  •  |BX„]S*  +  S[8X,|  ■  •  ■  |8X„f)  (SS*)'1    . 
What  we  need  to  estimate,  then,  is 

11(55*)"'  ([8X,|  •  •  •  |8X„]S*  +  S[8X,|  •  •  •  IBXJ^SS*)-1!!   . 
Now  it  is  easy  to  see  that  ||5*(55*)_1||  =  ct_1  so 

11(55*)-'  ([8XJ  •  •  •  |8X„]5*  +  5[8X,|  •  •  •  |8XB]r(55*)-I||  s  2  a"3  •  ||[8X,|  •  •  •  |8X„]|| 

s  2  V^  a"3  •  max  ||8X,||  .  (6.3) 
To  estimate  ||BX,||  note  that  X,  =  N(t/[(A-X,))  =  R  ((A  -\,)T  U^,  where  R()  denotes  the 
column  space  of  (•).  Now  let  Y  be  an  n  by  n  —  m  matrix  of  orthonormal  columns  and  Y±  an  n 
by  m  matrix  of  orthonormal  columns  orthogonal  to  the  columns  of  Y .  Then  if  Y  is  perturbed 
to    y+8y    (8F    in    an    arbitrary    direction),    it   is    easy    to   verify    that    yJ-    is   perturbed   to 

yi  -  y8yryl. 

In  our  case  we  want  Y  to  span  R((A-X,)rt/,)  so  we  may  take 

y  =  (A-\,)TUx(U\(A-\)(A-\t)TU\)-^    . 
X,  can  be  taken  as  Y±  and  so 

8y  =  (SArl/,  +  (A-\l)TbUl)(Ul(A-\i)(A-\i)TUl)-V2  +  (A-XjUfi 
where  \\D\\  is  on  the  order  of  ||8A||  and  ||8B||  and  where  the  result  of  perturbing  B  to  fi  +  8fl 

is  to  perturb  Ux  to  Ux  +  hUv  Thus 

8X,  =   -Y(hATUx  +   (A-X1)r8f/1)(l/[(A-X|)(A-X1)7-l/[)-1/2X1. 


so 


||8X,||  *  (||8A||  +  ||A-X1||-||8t/1.||)||(t/[(A-X1.)(A-X1)rl/[)-^|| 
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*(||BA|I+  ||A-X,|H|8l/f||)||(/l-X,.)-i||    . 
We  estimate   ||8£/,||  similarly.   Let  Y  =  B(B*B)~}/2  be  a  set  of  orthonormal  columns 
spanning  the  range  of  B.  We  may  take  U0  as  Y.  Perturbing  B  to  B  +  hB  makes  an  equivalent 
change  of  hY  =  hB{B*B)~xa  +  BD  in  Y,  where  |[D||  is  on  the  order  of  ||8fi||,  so 

11817,11=  ||-y8yryl||£JlMJL    . 

amto(fi) 

Putting  these  estimates  together  yields 

\\bX,\\£  ||SA|H|(/l-X1.)-1||+  ||6fi||-a-ii1(B)-K(A-\1.) 
and  substituting  into  (6.3)  we  get  the  following  bound  on  the  change  in  the  ||(55*)_1||: 

8IK55*)-1!!  ^  2  V^  a-3  •  max  (||8A  ||-||(A-X,)-i||  +  ||8*||-a-i(B)-K(A-X|.)) 

£2V"a    (ire" +  "int) ■ "- ■  "•  ■  (6-4) 

We  are  now  prepared  to  apply  Lemma  2.    Let  (A(s),B(s))  be  any  smooth  curve  from 
(A(0),B(0))  =  (A,B)  to  (A(j0),fl(i0))€S'  with  the  following  properties: 

(1)     It  is  parameterized  by  arclength  in  the  sense  that 

\\j;Ms)\\F        \\j;B(s)\\F 


IIAWIU  \\B(s)\\F 

(2)  A(j)  is  the  shortest  smooth  path  from  A(0)  to  A(s0)  such  that  ||A(i)||f  lies  between 
min(||A(0)||F  ,  11^(^)1^)  and  max(||A(0)||F  ,  ||A(j0)||f)  for  all  Osjsj0.  It  is  easy  to  see 
that  this  assumption  implies  that  the  length  of  the  curve  A(s)  (measured  using  ||||f)  is 
no  more  than  -ir/2  times  as  long  as  the  straight  line  distance  ||A(0)-A(j0)||f.  We  make 
the  analogous  assumption  about  B(s). 

Lety(j)  =  0-?B(S(A(s),B(s))).  Then  from  (6.4)  we  see 
so  by  applying  Lemma  2  with  p  =  3/2  the  "distance"  s0  to  set  the  S'  is  at  least 
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Vn    max    (kA{s)kBU))  y1/2(0)         Vn    max    (kA(j)kB(j))  a    ' 


OSjSi 


Osjsi. 


We  relate  j0  to  the  relative  distance  (6.1)  as  follows: 


*o  =  / 


\j;Ms)\\r         \\j;B(s)\\F 


llAWlt  ll»Wlt 


rfi 


/||-f  a(*)M*      Jllx*Wlt* 


<fj 


dj 


min    ||A(*)||f  min    ||fl(j)||f 


OsjSs 


OSjSi. 


f  ||A(0)  -  A(Jo)||F  |-  ||fl (0)  -  B(s0)\\F 


min(||A(0)||f  ,  \\A(s0)\\F         min(||fl(0)||f  ,  ||B(*0)||F 
Therefore  2 -s^tt  is  a  lower  bound  on  the  distance  measure  (6.1). 


It  remains  to  estimate  the  maximum  of  *A(J)Kfi(i)  in  the  denominator. 


Note  that  if 


HACQ^AJj,  ;  \\B(s)-B\\F 

min(\\A\\F  ,  \\A(s0)\\F)   "    min(||S||f  ,  ||fi(,0)||f)   ="  ^ 


then 


and 


ill  —  i 


||ftA||,  -  \\A(s)-A\\F  *  „  ||A||j,  ^  t,  ka  IKA-X,.)"1!! 


|SB||p  -  \\B(s)-B\\F  *  -n  \\B\\F  =  t,  kb  omiD(B)    . 


Therefore 


kA(s)  =  mix(||A(.)||F.||(A(*)-XJ)-»||.||A(*)-Xl|HlCAW-M"ll 

«*  imii^-ika-x,.)-'!! 


^  max  ( 


k,  • 


i-  IKA-^r'INMIU      l-  iKA-^rMHlBAlb, 

1    -    1    KA 


) 


Similarly 


(6.5) 
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\\B(s)\\F  \\B\\r+  \\SB\\F  1  +  ^1K£ 

Thus  (6.5)  implies 

(1  +  r\  max(KA,Kj))2 


Ki/rt  Kn/ri  s   K>i   *; 


(1  -  T)  maxl^.Kj,))2 
Thus,  dist((A,fl),S')  can  be  less  than  t\  only  if 


*1 


^/-  _,   (1  +  1  max(KA,KB))2 

it  vn  ka  kb  a       - 

(1  -  H  max(KA,KB))2 


or,  rearranging 


(y  V^  Ka  Kfi  o-    i  -n)-(l  +  max(icA,icB)  t\)2  a  (1  -  max(icA,KS)  i))2   .  (6.6) 

Since  a  s  ||S||  £  V^t  ka2=1  and  kb&1 


y  Vn  ka  kb  a    '  a:  max(KA,Kfl) 


Thus  (6.6)  is  true  only  if 


(|-  V^  K/1  Kfi  a-i  „).(!  tlV^^K,  a"1  „)2  a  (1  -  |-  V^  ka  kb  a"1  t,)2  .  (6.7) 
Letting  x  =  —■  Vn  ka  kb  a-1  i),  we  see  (6.7)  is  equivalent  to  x(l  +  x)2  a  (1-x)2,  or 
x3+x2+3x-l  2=  0,  which  is  only  true  if  x  a  .295.  Thus 

.295  _  .187 


dist((A,B),S') 


■|-  V^  Ka  Kb  a-i         Vn  ka  kb  a    » 
as  was  to  be  proved.  Q.E.D. 

Combining  this  with  Lemma  5  yields 

Theorem  13:  Let  S  be  the  set  of  (A,Z?)  where  no  nonsingular  matrix  X  of  eigenvectors  exists. 
Let  B  of  full  rank,  no  X,  be  an  eigenvalue  of  A,  and  ka  and  kb  be  defined  as  in  Lemma  6. 
Then 

dist((A,B),S)  *     r    -187  . 

Vn  Ka  Kb  K(X) 

Let  U  be  the  set  of  uncontrollable  pairs  (A,B).  Then  we  also  have 
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dist((A,fl),U)  *  -r    -187    ^^    • 

Proof:  From  Lemma  5  we  have  k(X)  s  <r~i,(£),  so  the  first  claim  follows  immediately  from 
Lemma  6.  The  second  claim  follows  since  the  set  of  uncontrollable  problems  U  is  contained 
inS.  Q.E.D. 

We  can  also  write  the  second  inequality  of  Theorem  13  as 

K(X)   ==    —7= , 

Vn  ka  kb  dist((A,B),U) 
implying  that  the  closer  (A,B)  is  to  being  uncontrollable,  the  larger  the  condition  number 

k(X)  of  the  problem.  Note  that  the  factors  ka  and  kb,  both  at  least  1,  tend  to  make  the  lower 

bound  on  k(X)  smaller.  The  reason  for  this  is  as  follows.    If  k(B)  is  large,  a  very  small 

perturbation  of  B  can  change  the  space  R(fl)  spanned  by  its  columns  greatly,  in  particular  in 

such  a  way  that  the  pole  assignment  problem  becomes  quite  easy.  Therefore  we  cannot 

guarantee  that  k(X)  will  be  bad  in  this  case.  Similarly,  if  k(A)  is  large,   some  \,  is  nearly  an 

eigenvalue  of  A.    Thus,  even  if  (A,B)  is  nearly  uncontrollable,  only  a  small  perturbation  may 

be  needed  to  put  a  pole  at  X,.  In  the  extreme  case  when  {X,}  is  the  spectrum  of  A,  F  =  0  solves 

the  pole  assignment  problem  even  if  (A,B)  is  exactly  uncontrollable  and  so  k(X)  depends 

only  on  how  hard  it  is  to  diagonalize  A .  A  similar  result  to  Theorem  13  was  proven  in  [3] 

using  more  ad  hoc  techniques. 
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7.  Interpretation  of  the  Differential  Inequalities 

In  this  section  we  provide  a  numerical  interpretation  of  the  differential  inequalities 

m-K2s  ||Dk||s  Af-K2  (7.1) 

stated  in  the  introduction.  We  will  use  the  relative  condition  number 

K   (ex)=  UBifeUfcfeU 

rtAg'  '  HsOOII  ' 

this  formula  holding  only  if  g  is  differentiable. 

As  in  the  introduction,  it  is  easy  to  see  that  by  multiplying  inequalities  (7.1)  by 
||x||/  k(x),  we  get 

ibk(x)  £  k„,(k,x)  s  M -k(x)     if||x||=l    •  (7.2) 

Inequalities  (7.2)  are  equivalent  to  the  statement:  solving  the  problem  x,  normalized  so  ||x||=l, 

is  essentially  just  as  hard  (within  factors  m  and  M)  as  computing  the  condition  number  k  of  the 

problem  x. 

If  we  further  assume  that  the  condition  number  tc  of  equation  (1)  is  homogeneous  of 
degree  k,  i.e.  ic(ax)  =  a*  k(x)  for  any  real  positive  a,  then  for  k  =  0  or  k=-1  we  will  show 
that  (7.2)  is  essentially  equivalent  to  (7.1).  All  the  condition  numbers  k  considered  in  this 
paper  are  homogeneous,  either  with  k=-\  (for  matrix  inversion,  eigenvectors,  and 
polynomial  zeros)  or  Jfc  =  0  (for  eigenvalues  and  pole  placement),  k  is  homogeneous  in  these 
cases  because  the  problems  themselves  (i.e.  the  mapping  from  problem  to  solution)  whose 
conditioning  k  measures  are  homogeneous. 

The  main  point  of  this  paper  has  been  that  if  (7.1),  or  equivalently  (7.2),  holds,  then 
the  distance  d  from  the  problem  x  to  the  nearest  point  in  the  set  of  ill-posed  problems  P  is 
bounded  by 

h~T  s  d  £  — Vt  C7-3) 

Mk(x)  mK(x) 

(how  d  is  measured  depends  on  the  norm  ||||  and  whether  the  degree  k  of  homogeneity  is  -1 
or  0).    Conversely,  we  will  see  that  if  we  define  k(x)  to  be  Vd  then  this  k(x)  satisfies  the 
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differential  inequalities  (7.1)  and  (7.2)  with  m  =  M  =  1. 

The  near  equivalence  of  (7.1),  (7.2)  and  (7.3)  is  very  satisfying,  because  it  says  that  if 
the  condition  number  k  has  the  utterly  reasonable  property  of  being  just  as  hard  to  compute 
as  the  solution  x  itself,  then  it  has  the  attractive  geometric  property  of  being  1  over  the 
distance  to  the  nearest  infinitely  ill-conditioned  problem.  Indeed,  the  common  formulas  for 
relative  condition  numbers  (e.g.  ||i4||||A_1||  for  matrix  inversion)  lead  one  to  believe  that  one 
must  solve  the  problem  (e.g.  compute  A-1)  to  within  reasonable  accuracy  to  get  a  reasonably 
accurate  condition  number.  This  intuition  is  corroborated  by  these  theorems. 

To  state  our  results,  we  will  need  two  measures  of  distance.  If  ||||2  is  Euclidean 
distance,  define 

dist2(x,/>)  ■  inf  ||x-y||2   , 

yiP 

where  P  is  the  set  of  ill-posed  problems.  Assume  the  set  P  is  homogeneous,  i.e.  x€P  implies 
axdP  for  all  scalars  a.  This  will  be  true  if  k  is  homogeneous.  Let  dGC(a,b)  denote  shortest 
distance  along  a  great  circle  between  two  points  a  and  b  on  the  unit  sphere.  Define  the 
"great  circle"  distance  between  x  and  P 

dis,-c"',)-;?f-(ii^'wr>- 

For  the  first  theorem  we  assume  that  k(x)  is  homogeneous  of  degree  -1.  This  is  the 
case  for  k(A)  =  ||A_1||  (matrix  inversion),  k(A)  =  ||5||*||P||,  S  a  reduced  resolvent  and  P  a 
projector  (eigenvectors),  k(T)  =  ||/»||  /  sep(A,B)  (eigenvectors),  and  k(p)  =  l/|p'(x)|,  p  a 
polynomial  with  zero  x  (polynomial  zero  finding).  Then  we  have 

Theorem  14:  Let  a  problem  x  have  condition  number  k(x)>0,  where  k  is  homogeneous  of 
degree  -1  and  has  a  continuous  Fr£chet  derivative  Dk(x)  almost  everywhere  it  is  finite.  Let 
P  denote  the  set  of  x  where  k  is  infinite.  Then  (7.4a)  and  (7.4b)  below  are  equivalent 
wherever  Dk  exists: 

3  0  <  m  <=;  M   such  that  m-K2(x)  s  ||Dk(x)||  s  Mk2(x)  (7.4a) 
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3  0<mSM    such  that   m-K(x)  £  k„,(k,x)  s  Mk(x)   for  all    ||x||=l    .        (7.4b) 
In  particular,   when  Dk   is  continuously  differentiable  and   ||-||=  ||-||2,  the  following  three 

conditions  are  equivalent: 

k2(x)  =  ||Dk(x)||2   ,  (7.5a) 

k(x)  =  k„,(k,x)   for  all    ||x|L=l    ,  (7.5b) 

io  =  dist2(x'p)  •  <75c> 

Proof:  (7.4a)  can  be  converted  into  (7.4b)  by  multiplying  through  by  \\x\\  I  k(x)  and  taking 
||x||=l.  Given  (7.4b),  (7.4a)  can  be  derived  by  substituting  x/||x||  for  x,  yielding  inequalities 
true  for  all  nonzero  x,  and  using  the  fact  that  if  k  is  homogeneous  of  degree  —  1,  Dk(x)  is 
homogeneous  of  degree  -2.  The  equivalence  of  (7.5a)  and  (7.5b)  follows  from  taking 
m  =  M=l  in  (7.4a)  and  (7.4b).  To  show  they  imply  (7.5c),  use  the  arguments  following 
Lemmas  1  and  2  in  section  2.  To  show  (7.5c)  implies  (7.5a)  and  (7.5b),  just  differentiate. 
Q.E.D. 

For  the  second  theorem  assume  that  k(x)  is  homogeneous  of  degree  0.  This  is  the  case 
for  k(A)  =  (||/»||2-1)1/2,  P  a  projector  (eigenvalues),  and  k(A,A,A)  =  ||X|H|X-1||,  (A,B)  a 
control  system  to  be  assigned  the  poles  A  via  state  feedback  F:  A  +  BF  =  XA.X~l. 

Theorem  15:  Let  a  problem  x  have  condition  number  k(x)>0,  where  k  is  homogeneous  of 
degree  0  and  has  a  continuous  Fre"chet  derivative  Dk(i)  almost  everywhere  it  is  finite.  Let  P 
denote  the  set  of  x  where  k  is  infinite.  Then  (7.6a)  and  (7.6b)  below  are  equivalent 
wherever  Dk  exists: 

3  0<msM   such  that   ot-k2(x)  s  ||Dk(x)||  s  Mk2(x)   for  all    ||x||=l   ,      (7.6a) 

3  0  <  m  s  M    such  that  m-ic(x)  s  Kre/(K,x)  £  M  k(x)    .  (7.6b) 

In    particular,    if   Dk    is   continuously   differentiable   and    ||||  =  ||||2,    the   following   three 

conditions  are  equivalent: 

k2(x)  =   ||Dk(x)||2   for  all    IM^l    ,  (7-7a) 
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k(x)  =  *.„,(*., x)    ,  (7.7b) 


TOO  =  distcc(  W|,)  '  (7-7c) 


Proof:  (7.6a)  can  be  converted  into  (7.6b)  by  substituting  x/||x||  for  x,  yielding  inequalities 
true  for  all  x#0,  and  using  the  fact  that  if  k  is  homogeneous  of  degree  0,  ||Dk||  is 
homogeneous  of  degree  -1.  Given  (7.6b),  (7.6a)  can  be  derived  by  multiplying  through  by 
k(x)  and  taking  ||x||=l.  The  equivalence  of  (7.7a)  and  (7.7b)  follows  from  setting  m  =  M  =  1 
in  (7.6a)  and  (7.6b).  To  derive  (7.7c)  from  (7.7ab)  the  argument  is  similar  to  that  of  the  last 
theorem.  The  only  difference  is  that  since  k  is  homogeneous  of  degree  0,  Die  is  orthogonal 
to  x  by  Euler's  theorem  for  homogeneous  functions.  Therefore  integrating  the  vector  field 
defined  by  Dk  yields  a  curve  lying  on  a  sphere  of  constant  radius.  This  is  why  we  deal  with 
shortest  paths  along  great  circles  between  points  of  unit  norm.  To  show  (7.7c)  implies  (7.7a) 
and  (7.7b),  just  differentiate.  Q.E.D. 
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8.  Connections  with  Newton's  Method 

In  this  section  we  show  that  in  case  the  function  /  which  maps  problems  to  solutions  is 
scalar,  the  differential  inequality  (1.1)  underlying  our  approach  is  nothing  more  than  a 
restatement  of  Newton's  method.  Thus  inequality  (1.1),  far  from  holding  only  coincidentally 
for  the  special  problems  considered  here,  actually  holds  locally  (in  a  sufficiently  small 
neighborhood  of  the  set  of  ill-posed  problems)  for  a  quite  general  class  of  problems.  When  / 
maps  between  spaces  of  the  same  dimension  greater  than  one,  the  relationship  with  Newton's 
method  weakens  but  still  holds  to  some  extent. 

In  the  scalar  case,  we  let  /  be  a  smooth  function  from  the  real  numbers  to  the  real 
numbers  (or  the  complex  numbers  to  the  complex  numbers),  which  we  take  as  the  solution  to 
some  problem  (such  as  "evaluate  /").  As  the  condition  number  of  the  problem,  we  can  take 
in  principle  any  multiple  of  the  derivative  of  /,  but  the  one  which  will  turn  out  to  satisfy  the 
differential  inequalities  (1.1)  is  the  absolute  condition  number  Kabs(x)  =  ]f  (x)//(x)|,  the 
instantaneous  relative  change  in  the  output  per  absolute  change  in  the  input.  (We  want  to 
measure  the  absolute  distance  from  the  input  to  the  nearest  ill-posed  input,  so  this  condition 
number  turns  out  to  work  instead  of  km(x)  =  [/"'(x)  x//(x)|,  the  instantaneous  relative  change 
in  the  output  per  relative  change  in  the  input.)  The  set  of  ill-posed  problems  is  the  set  of  x 
where  k(x)=/'(x)//(x)  s  infinite.  Since/  is  smooth,  this  is  (in  general)  the  set  of  zeros  of/. 
Following  the  paradigm  used  so  far,  if  |k'(x)|  is  close  to  k2(x),  then  1/|k(x)|  should  be  a  good 
approximation  to  the  distance  to  the  nearest  ill-posed  problem,  i.e.  zero  of/.  Computing 

K,(x)  =    f(x)f(x)-(f(*))2  (8.i) 

aoo)2 

we  see  that  if  x  is  close  to  a  simple  zero  of/  so  that/(x)  is  small,  then 

|k'(*)|  -  i-^#i  -  w*»2 

(/(*)) 

as  required  by  the  paradigm.   Thus,  an  even  better  approximation  to  a  zero  of/  should  be 
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X__J_  =  ,_  HsL 

k(x)  /'(*) 

which  is  Newton's  method. 

If  /  is  smooth  except  for  poles,  then  these  poles  are  also  ill-posed  problems.  In  this 
case,  just  consider  g  =  l/f,  so  the  poles  of/  are  zeros  of  g.  The  condition  number  \g'/g\  of  g 
is  identical  to  the  condition  number  [f'/f\  of  /,  so  again  we  get  Newton's  method,  but  now  it 
converges  to  a  pole  instead  of  a  zero. 

Examining  a  little  more  closely  the  condition  that  k'(x)=(k(x))2,  we  see  that  for  this  to 
be  true  (/'(x))2  has  to  dominate  /(x)/"(x)  in  the  numerator  of  (8.1)  above,  or  at  the  very 
least  \f(x)f" (x)\<(f'(x))2.  But  this  is  just  the  condition  that  the  Newton  iteration  contracts. 
For  letting  g(x)  =  x-f(x)/f  (x)  be  the  Newton  iteration,  we  easily  compute 

8()    r  Mir  go 

What,  if  instead  of  applying  Newton  to  find  a  zero  of/,  we  apply  Newton  to  find  a  zero 
of  1/k(x)?  The  formula  is  easily  seen  to  be 

1/k(x) £(x} 

(1/k(x))'       '    X  ,(x)(l       /(X)f(x) 

CTM)2 

which  under  the  same  conditions  as  above  (f(x)  small  so  that      \f(x)f"(x)/(f'(x))2\<l)  is 

asymptotically  the  same  as  Newton  applied  to/(x). 

Thus,  the  property  of  condition  number  being  the  reciprocal  of  distance  to  the  nearest 
ill-posed  problem  is  quite  universal,  holding  locally  for  all  simple  zeros  and  poles.  For 
multiple  zeros,  it  is  easy  to  see  you  get  a  factor  of  the  multiplicity  of  the  zero/pole  in  the 
distance  estimate  from  the  paradigm,  and  this  is  just  the  usual  modification  to  Newton  for 
quadratic  convergence  to  a  multiple  zero/pole. 

The  situation  is  not  quite  so  simple  when  /  maps  between  higher  dimensional  spaces  of 
equal  dimension.  As  we  will  see,  the  steepest  ascent  direction  for  k  is  asymptotically  (D/)r-/, 
whereas   the   Newton    direction   is    (D/)-1/.    These   two    directions   can   be   quite   different, 
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especially  when  Df  is   ill-conditioned.   Nonetheless,  we  will  show   that  asymptotically   the 
paradigm  supplies  upper  and  lower  bounds  on  the  norm  of  the  Newton  correction. 

In  order  to  make  the  calculations  easier,  we  use  the  2-norm  for  vectors  and  Frobenius 
norm  for  matrices.  Letting  k(x)  =  ||D/(x)||/|[/"(x)||,  we  compute 


[tr(Df)T^-Df] 


Dk  = 


!fi JMJiiD£iL_x 


Ml  110/11  li/ii2    IId/II  M\  • 

In  the  neighborhood  of  a  zero  of  /,  the  second  term  dominates  the  first,  yielding 


llPKlJ  «  K2    \\{Dfff\\ 

"    "  \\Df\\\\f\\ 


It  is  easy  to  see  that 


js! „  -2  H(o/)7ll  ^  K2 


IID/H  W l\\  '  \Pf\W\fW 

so  that  according  to  our  paradigm,  provided  x  is  close  enough  to  a  zero  of/, 

1  =  JML  *  dist(,,P)  *  ||(d/)-||  M  =  UBfll  ll^"H   • 

K  IID/II  K 

The  norm  of  the  Newton  correction  is  bracketed  by  these  bounds  as  expected: 

JflLas  ||(D/)-'/||^  ||(D/)->||  urn   • 
Thus,  the  reciprocal  of  the  condition  number  provides  an  asymptotic  lower  bound  on 
the  distance,  and  can  underestimate  the  distance  by  a  factor  of  at  most  \\Df\\  ||(£>/)-1|l- 
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9.  Algebraic  Functions 

In  this  section  we  show  that  in  a  neighborhood  of  a  branch  point  of  any  algebraic 
function,  we  expect  the  distance  to  the  branch  point  to  be  at  least  a  multiple  of  the  square  of 
the  reciprocal  of  the  condition  number.  As  in  the  last  section,  for  utterly  general  functions 
the  most  we  can  prove  is  that  this  relationship  holds  in  a  neighborhood  of  a  branch  point,  not 
globally.  That  such  a  relationship  holds  globally  for  eigenproblems  and  polynomial  zero 
finding  depends  on  exploiting  the  special  structure  of  these  problems. 

We  define  an  algebraic  function  as  a  root  X  of  the  following  equation: 

0  =  2  PfW  *  -  p(Pi)  ■  (8-1) 

1  =  0 

Here  n  must  be  at  least  two  for  there  to  be  a  branch.  p,(x)  is  a  scalar  function  of  the  vector 

n 

variable  x.  By  P(zt)  we  mean  ^  z,  \'  where  z,  is  any  subscripted  quantity  (scalar,  vector,  or 

i  =  0 

n 

matrix).  Analogously,  we  let  P'(zt)  denote  2  z,  '  X'-1.    If  X.  is  a  particular  simple  root  of 

i=i 

(8.1),  then  we  can  compute  the  derivative  DX  of  X  with  respect  to  x  by  linearizing  as  follows: 
0=2  (X  +  6X)'  />,(*+ &x) 

i  =  0 

=  P(p)  +  P'(p,)8X  +  P{DPl)hx  +  0(||8x|P)    . 
Using  the  fact  that  P(p)  =  0  we  solve  for  8X  and  get 

~P(DPl) 

8X  =  ^-^-  8x 

P'(Pi) 

or 

fir,) 

leading  us  to  define 

k(x,X)  =  IIDXH  =  -^^    . 
This  condition  number  can  be  infinite  only  when  P'(pt)  =  0.  When  p,(x)  =  x,,  i.e.  we  have  a 
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simple  polynomial  with  coefficients  x,  (the  situation  analyzed  in  section  5),  this  condition 
P'(pi)  =  0  is  the  usual  condition  for  a  double  root.  In  x,  space,  the  set  of  points  where  P'(p,) 
vanishes  forms  a  branch  surface  rather  than  branch  point.  The  important  thing  to  notice  is 
that  since  the  p,  are  smooth,  k(x,X)  is  essentially  a  multiple  of  V\P'  (pjl  in  a  neighborhood  of 
a  branch  surface  (barring  accidental  cancellation  in  the  numerator  of  k(x,X). 

To  compute  Dk  we  proceed  as  above  by  linearing  k(x  +  8jc,A.+  8X).  The  result  of  this 
rather  tedious  calculation  is 

lln    II  -  II  (/>(Pp-))>    P(DDp') 

(P(Dp,))«     (P'(DPl))TP(DPl)    _    \\P(DPl)\\    P'JDP.) 
\\P(DPl)  ]r(p,)\P'(Pi)  \P'(Dp,)\    P'(DPl)    ' 

\\P(Dp,)\\    P"(p,)P(DPl) 
\P'(DPl)\      P'ipjP'ip,)     "• 

As  with  k  itself,  the  only  factors  which  contribute  to  ||Z?k||  going  to  infinity  are  the  P'(DPl) 
in  the  denominators.  The  first  term  has  one,  the  second  two  terms  have  two,  and  the  last 
term  has  three  factors  of  P'(DPl)  in  the  denominator.  Thus  we  expect  ||Dk||  to  grow  to 
infinity  no  faster  than  the  third  power  of  k,  and  barring  accidental  cancellation  in  the 
numerator  of  the  third  term,  it  will  grow  this  fast.  Applying  Lemma  2  in  section  2  with  3  =  3, 
we  see  that  a  lower  bound  on  the  distance  to  the  nearest  ill-posed  problem  will  be 
proportional  to  the  reciprocal  of  the  square  of  the  condition  number. 

Looking  back  to  the  sections  4  and  5  on  eigenvalue  problems  and  polynomial  zero 
finding  we  see  this  square  dependence  exhibited.  In  section  5  it  was  explicit  in  Theorem  12 
and  its  Corollary  1,  which  supplied  lower  bounds  on  the  distance  to  the  nearest  polynomial 
with  multiple  roots  as  a  multiple  of  the  square  of  the  condition  number.  In  section  4  one  must 
look  somewhat  closer.  The  condition  number  for  an  eigenvalue  X.  was  computed  to  be  the 
norm  of  its  associated  projector  \\P\\.  A  lower  bound  on  the  distance  was  computed  to  be 
proportional  to  sep/|lP||  (sep  and  ||/»||  are  defined  in  section  4).  At  first  glance,  the  lower 
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bound  does  not  appear  to  be  the  reciprocal  of  the  square  of  the  condition  number,  no  matter 
how  big  HP  ||  is.  In  this  case,  the  sufficiently  small  neighborhood  of  the  set  of  ill-posed 
problems  where  this  behavior  occurs  is  the  set  where  \\P\\  assumes  its  maximum  value,  which 
is  proportional  to  1/sep.  In  this  neighborhood  the  condition  number  behaves  like  1/sep  and 
the  lower  bound  like  sep2,  as  expected. 

Finally,  note  that  we  did  not  need  to  assume  that  the  p,  were  actually  polynomial 
functions  of  x,  just  that  they  were  smooth.  Thus  the  results  of  this  section  apply  to  a  larger 
class  of  problems  than  just  algebraic  functions. 
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10.  Extensions 

In  [9]  Kahan  relates  the  condition  number  and  the  distance  to  the  nearest  ill-posed 
problem  for^several  problems  of  numerical  analysis.  In  that  paper  he  also  observed  that  if 
one  had  an  ill-posed  problem,  then  by  restricting  it  to  lie  within  the  set  P  of  ill-posed 
problems  it  often  became  well-posed  again  in  the  sense  that  if  x  and  x  +  hx  were  both  in  P 
and  8x  were  small,  the  difference  between  the  solution  at  x  and  the  solution  at  x+8x  would 
also  be  small.  It  would  remain  well-posed  until  it  approached  a  subset  Px  of  P,  where  it 
again  become  ill-posed.  Restricted  to  lie  within  P,,  however,  it  again  became  well-posed  until 
it  approached  a  further  subset  P2,  and  so  on. 

For  example,  computing  the  pseudo-inverse  T^  of  a  matrix  T  is  equivalent  to  matrix 
inversion  for  square,  nonsingular  matrices,  in  which  case  the  relative  condition  number  of  T 
for  pseudo-inversion  can  be  written  as  al/a„,  where  o^a  •  •  •  £a,  are  the  singular  values  of 
T.  The  distance  to  the  set  P,  of  singular  matrices  in  the  ||-||f  norm  is  a„,  and  as  it  approaches 
zero,  the  condition  number  oyer,,  approaches  infinity.  If  a„  is  exactly  zero,  the  pseudo- 
inverse  is  well  defined,  and  as  long  as  T  is  restricted  to  have  rank  n-1  (cr„  =  0,  o-„_,*0),  the 
condition  number  of  its  pseudo-inverse  is  cr1/an_1,  where  an_,  is  the  distance  to  the  nearest 
matrix  of  rank  n- 2.  So  as  T  approaches  the  subset  P2  of  rank  n-2  matrices,  its  condition 
number  again  becomes  infinite.  If  T  is  restricted  to  have  rank  n-2,  its  condition  number 
becomes  a,/CTn_3,  which  remains  finite  until  T  approaches  the  set  P3  of  matrices  of  rank 
n-3,  and  so  on. 

In  the  course  of  establishing  the  results  of  the  last  paragraph,  Kahan  [9]  also  showed 
that  if  k(T)  =  ||r+||,  then 

rank(D  =  rank(r+S7"> 

which  is  analogous  to  our  differential  inequalities  (1.1).  Using  (10.1)  as  we  did  (1.1),  one  can 
find  another  proof  that  the  distance  within  the  set  of  matrices  T  of  constant  rank  *  to  the 
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nearest  one  of  rank  k—1  is  1  /  ||7"+  ||  =  ak. 

This  hierarchy  of  sets  of  ill-posed  problems  Pi  plays  an  important  role  in  numerical 
analysis  because  an  ill-posed  problem  can  often  be  regularized  by  restricting  it  to  lie  in  a  set 
Pj.  For  example,  when  solving  rank  deficient  least  squares  problems  one  often  regularizes  by 
artificially  forcing  the  smallest  singular  values  to  zero,  thus  solving  a  problem  forced  to  lie  in 
a  set  Pj  [5].  Similarly,  when  computing  eigenvalues  and  eigenvectors  one  often  computes  the 
invariant  subspace  belonging  to  a  cluster  of  eigenvalues  because  it  can  be  much  better 
conditioned  than  the  individual  eigenvectors  which  span  it.  This  is  an  important  technique 
when  computing  functions  of  matrices  [12]. 

In  a  future  paper  we  hope  to  extend  the  techniques  of  this  paper  to  estimating  distances 
to  hierarchical  subsets  Pt  of  other  ill-posed  problems.  This  should  lend  further  geometrical 
and  numerical  insight  into  the  solution  of  ill-posed  problems. 
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